Preface



For the second time, the European Software Engineering Conference is being held
jointly with the ACM SIGSOFT Symposium on the Foundations of Software Engineering
(FSE). Although the two conferences have different origins and traditions, there is
a significant overlap in intent and subject matter. Holding the conferences jointly when
they are held in Europe helps to make these thematic links more explicit, and encourages
researchers and practitioners to attend and submit papers to both events.

The ESEC proceedings have traditionally been published by Springer-Verlag, as 
they are again this year, but by special arrangement, the proceedings will be distributed 
to members of ACM SIGSOFT, as is usually the case for FSE. 

ESEC/FSE is being held as a single event, rather than as a pair of collocated events. 
Submitted papers were therefore evaluated by a single program committee. ESEC/FSE 
represents a broad range of software engineering topics in (mainly) two continents, and 
consequently the program committee members were selected to represent a spectrum of 
both traditional and emerging software engineering topics. A total of 141 papers were 
submitted from around the globe. Of these, nearly half were classified as research
papers, a quarter as experience papers, and the rest as both research and experience papers.
Twenty-nine papers from five continents were selected for presentation and inclusion 
in the proceedings. Due to the large number of industrial experience reports submitted, 
we have also introduced this year two sessions on short case study presentations. 

This year we also have several invited talks from outside the software engineering 
mainstream. Brian Jones, who, together with Bertrand Piccard, succeeded this year in
circumnavigating the globe by balloon for the first time ever, has accepted the invitation
to speak to the ESEC/FSE audience about Managing the Last Human Challenge of
the 20th Century: Some Right Ways! Kent Beck will speak about Extreme Programming:
A Discipline of Software Development. Krzysztof Czarnecki will present the invited
paper, co-authored with Ulrich Eisenecker, on Components and Generative Programming.
Finally, Niklaus Wirth will be accepting the ACM SIGSOFT Outstanding
Research Award for his contributions to Software Engineering and Programming Lan- 
guages, and he will speak about From Programming to Software Engineering — and 
Back. 

The paper selection process was largely comparable to that used by previous ESEC 
and FSE program committees. Each of the papers submitted to the conference was
evaluated by at least three program committee members. Due to the extremely broad range
of topics covered, checks were introduced to ensure that most papers were reviewed by
at least two experts in the field. When reviewers claimed inadequate expertise, additional
reviews were solicited, in some cases from outside experts. Conflicts of interest were
specially handled according to established ACM rules; program committee members 
who were authors of submitted papers, or who had recently worked with the authors, 
did not have access to reviews of the concerned papers, and left the meeting room when 
the papers were discussed. (The rate of acceptance of papers submitted by program 
committee members was, incidentally, comparable to the overall rate of acceptance.) 



For the first time, ESEC/FSE has introduced the CyberChair online submission and 
reviewing system. The software was developed by Richard van de Stadt, originally for 
the European Conference on Object-Oriented Programming. The software not only
allowed authors to submit their abstracts and papers electronically, but also provided
additional interfaces for program committee members to enter and browse reviews for their
papers, and for the program chair to monitor progress of the review process. CyberChair
was especially helpful at the two most critical times: around the paper submission
deadline, and in the two weeks before the program committee meeting.

Obviously a large number of people contributed to making the conference a success.
We would like to thank especially the program committee members for their dedication
in preparing thorough reviews in time for the meeting, and for going out of their
way to attend the program committee meeting. (It is the first time in our experience that 
every single member of a program committee of this size has made it to the review 
meeting!) We would also like to thank Isabelle Huber for the administrative support in 
Berne, Richard van de Stadt for providing and supporting CyberChair, Jean-Guy Schneider
for installing and supporting CyberChair in Berne, and Francine Decavele for her
efficient contribution to the local organization. 

A special thanks to SUPAERO, which has accepted the task of hosting us for a full
week, right at the beginning of the first term of the academic year 2000. 

We would also like to thank all the additional reviewers who evaluated submitted 
papers, sometimes on extremely short notice. Finally, we would like to thank the authors
of all submitted papers for their contributions, and the authors of all accepted papers
for their help and collaboration in getting the proceedings prepared quickly and
smoothly in time for the conference. 
Extreme Programming:

A Discipline of Software Development 



Kent Beck 

Daedalos Consulting, Germany 



Abstract. You can look at software development as a system with 
inputs and outputs. As with any system, software development needs 
negative feedback loops to keep it from oscillating. The negative feedback
loops traditionally used (separate testing groups, documentation,
lengthy release cycles, reviews) succeed at keeping certain aspects under
control, but they tend to have only long-term benefits. What if we
could find a set of negative feedback loops that kept software development
under control, but that people wanted to do, even under stress,
and that contributed to productivity both short and long term? 
Components and Generative Programming

Krzysztof Czarnecki¹ and Ulrich W. Eisenecker²

¹ DaimlerChrysler AG Research and Technology, Ulm, Germany
czarnecki@acm.org

² University of Applied Sciences Heidelberg, Germany
ulrich.eisenecker@t-online.de



Abstract. This paper is about a paradigm shift from the current practice of manually searching 
for and adapting components and their manual assembly to Generative Programming, which is 
the automatic selection and assembly of components on demand. First, we argue that the 
current OO technology does not support reuse and configurability in an effective way. Then we 
show how a system family approach can aid in defining reusable components. Finally, we 
describe how to automate the assembly of components based on configuration knowledge. We
compare this paradigm shift to the introduction of interchangeable parts and automated 
assembly lines in the automobile industry. 

We also illustrate the steps necessary to develop a product line using a simple example of a car 
product line. We present the feature model of the product line, develop a layered architecture 
for it, and automate the assembly of the components using a generator. We also discuss some 
design issues, applicability of the approach, and future development. 



1 From Handcrafting to an Automated Assembly Line 

This paper is about a paradigm shift from the current practice of manually searching 
for and adapting components and their manual assembly to Generative Programming, 
which is the automatic selection and assembly of components on demand. This 
paradigm shift takes two steps. First, we need to move our focus from engineering 
single systems to engineering families of systems — this will allow us to come up with 
the “right” implementation components. Second, we can automate the assembly of the 
implementation components using generators. 

Let us explain this idea using a metaphor: Suppose that you are buying a car and 
instead of getting a ready-to-use car, you get all the parts necessary to assemble the car
yourself. Actually, not quite. Some of the parts are not a one-hundred-percent fit and 
you have to do some cutting and filing to make them fit (i.e. adapt them). This is the 
current practice in component-based software engineering. Brad Cox compares this 
situation to the one at the brink of the industrial revolution, when it took 25 years of 
unsuccessful attempts, such as Eli Whitney’s pioneering effort, until John Hall finally
succeeded in manufacturing muskets from interchangeable parts in 1822 (see [Cox90,
Wil97]). Then it took several decades before this groundbreaking idea of
mass-manufacturing from interchangeable parts spread to other sectors.

Even if you use a library of designed-to-fit, elementary components (such as the C++
Standard Template Library [MS96]), you still have to assemble them manually and
there is a lot of detail to care about. In other words, even if you don’t have to do the 

O. Nierstrasz, M. Lemoine (Eds.): ESEC/FSE '99, LNCS 1687, pp. 2-19, 1999. 

© Springer-Verlag Berlin Heidelberg 1999 





cutting and filing, you still have to assemble your car yourself (that’s the “lego 
principle”). 

Surely you would rather be able to order your car by describing it in abstract terms,
saying only as much as you really care to, e.g. “get me a Mercedes-Benz S-Class with 
all the extras” or “a C-Class customized for racing, with a high-performance V8 
engine, four-wheel vented disc brakes, and a roll cage”, and get the ready-to-drive car. 
And that’s what Generative Programming means for the application programmer: The 
programmer states what he wants in abstract terms and the generator produces the 
desired system or component. 

This magic works if you (1) design the implementation components to fit a common 
product-line architecture, (2) model the configuration knowledge stating how to 
translate abstract requirements into specific constellations of components, and (3) 
implement the configuration knowledge using generators. This is similar to what 
happened in car building: the principle of interchangeable parts was the prerequisite 
for the introduction of the assembly line by Ransome Olds in 1901, which was further 
refined and popularized by Henry Ford in 1913, and finally automated using industrial 
robots in the early 1980s.¹ ²

The rest of this paper is structured as follows. We first explain the necessary transition
from one-of-a-kind development to the system family approach (Sections 2-3). Then,
we describe the idea of automating component assembly based on configuration
knowledge (Section 4). Next, we demonstrate the previous two steps using a
simple example: we first come up with the components for a product line
(Sections 5.1-5.4) and then develop the generator for their automatic assembly
(Sections 5.5-5.6). Finally, we give some real-world examples (Section 6),
describe the idea of active libraries (Section 7), and present the conclusions (Section 8).



¹ The world’s first industrial robot was installed in 1961 at a General Motors factory in New
Jersey, USA, but it was the advance of the microchip in the 1970s that made possible the 
enormous advances in robotics. 

² Some people think that the main purpose of an assembly line is to produce masses of the same
good, which in software corresponds to copying CDs. Nothing could be further from the truth. 
For example, the Mercedes-Benz assembly line in Sindelfingen, Germany, produces hundreds 
of thousands of variants of the C-, E-, and S-Class (there are about 8000 cockpit variants and 
10000 variants of seats alone for the E-Class). There are almost no two equal cars rolling from 
the assembly line the same day. That means that the right kind of engine or any other 
component has to be available at the right place and time at the assembly line. Furthermore, the 
suppliers have to provide the right parts at the right time (to minimize storage costs). And the 
whole process starts when different people order different cars at car dealerships. The 
fulfillment of the customer orders requires an enormous logistic and organizational effort 
involving a lot of configuration knowledge (in fact, they use product configurators based on 
configuration rules). Thus, the analogy between the automobile industry and building 
customized software solutions using a gallery of standard components (e.g. SAP’s R3 modules) 
is not that far-fetched after all. 
2 “One-of-a-Kind” Development

Most OOA/D methods focus on developing single systems rather than families of 
systems. Therefore, they do not adequately support software reuse. More specifically, 
they have the following deficiencies [CE99a]:³

• No distinction between engineering for reuse and engineering with reuse: Taking 
reuse into account requires splitting the OO software engineering process into 
engineering for reuse and engineering with reuse. The scope of engineering for 
reuse is not a single system but a system family. This enables the production of 
reusable components. The process of engineering with reuse has to be designed 
to take advantage of the reusable assets produced during engineering for reuse. 
Current OOA/D processes lack any of these properties. 

• No domain scoping phase: Since OOA/D methods focus on engineering single 
systems, they lack a domain scoping phase, where the target class of systems is 
selected. Also, OOA/D focuses on satisfying “the customer” of a single system 
rather than analyzing and satisfying the stakeholders (including potential 
customers) of a class of systems. 

• No differentiation between modeling variability within one application and 
between several applications: Current OO notations make no distinction between 
intra-application variability, e.g. variability of objects over time and the use of 
different variants of an object at different locations in an application, and 
variability between applications, i.e. variability across different applications for 
different users and usage contexts. Furthermore, OO implementation mechanisms 
for implementing intra-application variability (e.g. dynamic polymorphism) are 
also used for inter-application variability. This results in “fat” components or 
frameworks ending up in “fat” applications. 

• No implementation-independent means of variability modeling: Furthermore, 
current OO notations do not support variability modeling in an implementation-independent
way, i.e. the moment you draw a UML class diagram, you have to
decide whether to use inheritance, aggregation, class parameterization, or some 
other implementation mechanism to represent a given variation point. 
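
To make the last two deficiencies concrete, here is a minimal sketch (ours, not taken from any of the cited methods) of a single variation point, the engine of a car, realized with two different implementation mechanisms. All type names (EngineBase, DynamicCar, StaticCar, and so on) are hypothetical. The feature being modeled is the same in both cases; only the mechanism, and with it the binding time, differs.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Mechanism 1: dynamic polymorphism. The engine can be exchanged at run
// time, so every application carries the virtual-call machinery.
struct EngineBase {
    virtual ~EngineBase() {}
    virtual std::string kind() const = 0;
};
struct DynGasoline : EngineBase {
    std::string kind() const override { return "gasoline"; }
};
struct DynElectric : EngineBase {
    std::string kind() const override { return "electric"; }
};

class DynamicCar {
public:
    explicit DynamicCar(std::unique_ptr<EngineBase> engine)
        : engine_(std::move(engine)) {}
    std::string describe() const {
        return "car with " + engine_->kind() + " engine";
    }
private:
    std::unique_ptr<EngineBase> engine_;
};

// Mechanism 2: class parameterization. The engine is fixed per
// application at compile time; unused variants are never instantiated.
struct StaticGasoline { static std::string kind() { return "gasoline"; } };
struct StaticElectric { static std::string kind() { return "electric"; } };

template <class Engine>
struct StaticCar {
    std::string describe() const {
        return "car with " + Engine::kind() + " engine";
    }
};

// Both mechanisms describe the same variant of the same feature.
inline void checkSameVariant() {
    DynamicCar dynamicCar{std::unique_ptr<EngineBase>(new DynElectric)};
    StaticCar<StaticElectric> staticCar;
    assert(dynamicCar.describe() == staticCar.describe());
}
```

A notation that forces this choice at modeling time mixes the question "what varies?" with the question "how is the variation implemented?".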

Patterns and frameworks represent an extremely valuable contribution of the OO 
technology to software reuse. However, they still need to be accompanied by a 
systematic approach to engineering for and with reuse. 

3 System Family Approach 

In order to come up with reusable components, we have to move our focus from single 
systems to system families.⁴ The first thing to do is to distinguish between the



³ As of writing, OOram [Ree96] is the only OOA/D method known to the authors which truly
recognizes the need for a specialized engineering process for reuse. 

⁴ Parnas defined a system family as follows [Par76, p. 1]: “We consider a set of programs to
constitute a family, whenever it is worthwhile to study programs from the set by first studying 
development for reuse and with reuse. In the software reuse community, development
for reuse is referred to as Domain Engineering.⁵ Development with reuse, on the
other hand, is referred to as Application Engineering. Let us take a closer look at the 
steps of Domain Engineering: 

• Domain Analysis: Domain Analysis involves domain scoping and feature 
modeling. Domain scoping determines which systems and features belong to the 
domain and which not. This process is driven not only by technical but also 
marketing and economic aspects (i.e. there is an economic analysis as in the case 
of any investment) and involves all the stakeholders of the domain. For this 
reason, the resulting domain is often referred to as a product line. Feature 
modeling identifies the common and variable features of the domain concepts and 
the dependencies between the variable features. Refining the semantic contents of 
the features usually requires several other modeling techniques such as modeling 
relationships and interactions between objects (e.g. using UML). 

• Domain Design: The purpose of domain design is to develop a common 
architecture for the system family. 

• Domain Implementation: Finally, we need to implement the components, 
generators, and the reuse infrastructure (dissemination, feedback loop from 
application engineering, quality control, etc.). 

There are many Domain Engineering methods available, e.g. FODA [KCH+90] and 
ODM [SCK+96] (e.g. see surveys in [Cza98, Arr94]). However, most of the methods 
incorporate the above-listed steps in some form. Examples of methods combining 
Domain Engineering and OO concepts are RSEB [JGJ98 and GFA98], DEMRAL 
[Cza98], and the work presented in [CN98]. 

4 Problem vs. Solution Space and Configuration Knowledge 

Once we have the “right” components, the next step is to provide means of mapping 
abstract requirements onto appropriate configurations of components, i.e. automate 
the component assembly. The key to this automation is the configuration knowledge, 
which maps between the problem space and the solution space (Fig. 1). 

The solution space consists of the implementation components with all their possible 
combinations. The implementation components are designed to maximize their 
combinability (i.e. you should be able to combine them in as many ways as possible), 
minimize redundancy (i.e. minimize code duplication), and maximize reuse.⁶ The



the common properties of the set and then determining the special properties of the individual 
family members.” 

⁵ In the software reuse community, a domain is defined as a well-scoped family of existing and
potential systems including the expertise required to build these systems. So, in our context, we 
can use the terms “domain” and “system family” interchangeably. 

⁶ These are also the properties usually required from generic components. Thus, the principles
of Generic Programming apply to the solution space. An even higher level of genericity of the 
implementation components can be achieved using Aspect-Oriented Programming [KLM+97]. 
problem space, on the other hand, consists of the application-oriented concepts and
features that application programmers would like to use to express their needs. The 
configuration knowledge consists of illegal feature combinations (certain 
combinations of features may not be allowed), default settings (if the application does
not specify certain features, some reasonable defaults are assumed), default 
dependencies (some defaults may be computed based on some other features), and 
construction rules (combinations of certain features translate into certain combinations 
of implementation components). 




configuration knowledge
• illegal feature combinations
• default settings
• default dependencies
• construction rules
• optimizations




Fig. 1. Problem and solution space 

This kind of separation between problem and solution space and the configuration 
knowledge allow us to enjoy the same convenience in requesting concrete systems or 
components as in ordering cars: You don’t have to enumerate all the concrete parts, 
e.g. suspension, carburetor, battery, and all the nuts and bolts, but rather specify the 
class (e.g. C- or E-Class), the line (e.g. Classic-, Elegance- or SportLine), and the 
options (e.g. side airbags, trailer coupling, air conditioning, etc.) and get a finished 
car. Please note that the features in the problem space may be inherently abstract, e.g. 
SportLine. In other words, there is no single component that makes a car a sports
car, but it is rather a particular combination of carefully selected parts that achieves this
quality (in computer science, think of the features “optimized for speed” or “optimized 
for space”). This also makes Generative Programming different from Generic 
Programming: While Generative Programming allows you to specify abstract features, 
the parameters you supply in Generic Programming represent concrete components. 

It is important that you can order a car by specifying only as much as you want. For 
example, there are people who have a lot of money and no time. They could say “give 
me a Mercedes S-class with all the extras”. On the other hand, there could be a 
customer interested in car racing, who needs to be very specific. He might want to 
specify details about particular car parts, e.g. “an aluminum block with cast-in Nikasil 
sleeves and twin-spark plug, three-valve cylinder heads”. 

The same spectrum of specificity should be supported by a library of reusable 
components [KLL+97]. As an application programmer, when you request a 
component, you should be able to specify only as much as necessary. You should not 
be forced to specify too much detail: this would make you unnecessarily dependent on
the implementation of the library. However, if necessary, you should be able to 
specify details, or even supply your own implementations for some aspects. Defaults, 
default dependencies, and hard constraints (i.e. illegal feature combinations) make this
kind of flexibility possible. 

Finally, the configuration knowledge is implemented using generators. Depending on 
the complexity of the configuration space, the configuration process may be an 
algorithmic one (for simple configuration spaces) or search-based (for more complex 
configuration spaces). 
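
For a simple configuration space, the algorithmic variant can be sketched in a few lines. The following is only an illustration under our own assumptions, anticipating the car example of Section 5: CarSpec and configure are hypothetical names, the two defaults are invented for the sketch, and the result is a component expression written as a string rather than a real assembly. Only the constraint that an electric or hybrid engine requires an automatic transmission comes from the example itself.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Problem space: an abstract, possibly partial specification.
enum class EngineKind { Unspecified, Gasoline, Electric, Hybrid };
enum class TransmissionKind { Unspecified, Automatic, Manual };

struct CarSpec {
    EngineKind engine = EngineKind::Unspecified;
    TransmissionKind transmission = TransmissionKind::Unspecified;
    bool trailerCoupling = false;
};

// Maps a specification onto a component expression (solution space).
// The defaults chosen below are assumptions made for this sketch.
inline std::string configure(CarSpec spec) {
    // Default setting: an unspecified engine becomes a gasoline engine.
    if (spec.engine == EngineKind::Unspecified)
        spec.engine = EngineKind::Gasoline;

    const bool needsAutomatic = spec.engine != EngineKind::Gasoline;

    // Default dependency: the default transmission depends on the engine.
    if (spec.transmission == TransmissionKind::Unspecified)
        spec.transmission = needsAutomatic ? TransmissionKind::Automatic
                                           : TransmissionKind::Manual;

    // Illegal feature combination: reject it rather than repair it.
    if (needsAutomatic && spec.transmission == TransmissionKind::Manual)
        throw std::invalid_argument(
            "electric or hybrid engine requires automatic transmission");

    // Construction rules: features translate into nested components.
    const std::string engine =
        spec.engine == EngineKind::Gasoline   ? "GasolineEngine"
        : spec.engine == EngineKind::Electric ? "ElectricEngine"
                                              : "HybridEngine";
    const std::string transmission =
        spec.transmission == TransmissionKind::Automatic
            ? "AutomaticTransmission"
            : "ManualTransmission";
    std::string body = "CarBody[" + transmission + "[" + engine + "[Config]]]";
    if (spec.trailerCoupling)
        body = "CarBodyWithTC[" + body + "]";
    return "Car[" + body + "]";
}
```

A client that says nothing at all (a default-constructed CarSpec) still gets a complete car, while a client that asks for a hybrid engine with a manual transmission is told that the combination is illegal.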

5 Example: Designing a Product Line 

Now we’ll illustrate the steps necessary to develop a product line and automate the 
component assembly. We will use a pedagogical toy example: a simple C++ model of 
a car. Real applications of these techniques are described in Section 6. 

5.1 Domain Analysis 

Domain analysis involves domain scoping and feature analysis of the concepts in the 
domain. Feature analysis allows you to discover the commonalities and the 
variabilities in a domain. Reusable models usually contain large amounts of 
variability. For example, a bank account can have different types (savings, checking, 
or investment), different kinds of ownership (personal or business), different 
currencies, different bank statement periods, different interest rates, service fees, 
overdraft policies, etc. 

But let’s start with the domain analysis for our car product line. Suppose that based on 
our market studies, we decided that the cars we are going to produce will provide the 
following features: automatic or manual transmission, electric or gasoline or hybrid 
engine, and an optional trailer coupling. 

The results of feature analysis can be documented using feature diagrams [KCH+90],
which reveal the kinds of variability contained in the design space. The feature 
diagram for our car is shown in Fig. 2. The root of the diagram represents the concept 
car. The remaining nodes are features. The diagram contains four kinds of features:⁷

• Mandatory features: Mandatory features are pointed to by simple edges ending
with a filled circle. The features car body, transmission, and engine are 
mandatory and thus part of any car. 

• Optional features: Optional features are pointed to by simple edges ending with
an empty circle, e.g. trailer coupling. A car may have a trailer coupling or not. 

• Alternative features: Alternative features are pointed to by edges connected by an 
arc, e.g. automatic and manual. Thus, a car may have an automatic or manual 
transmission. 

• Or-features: Or-features are pointed to by edges connected by a filled arc, e.g. 
electric and gasoline. Thus, a car may have an electric engine, a gasoline engine, 
or both. 



⁷ Or-features are not part of the notation in [KCH+90]. They were introduced in [Cza98]. See
[Cza98] for a full description of the feature diagram notation used here. 
Fig. 2. A sample feature diagram of a car

The diagram in Fig. 2 describes twelve different car variants (two different 
transmissions, three kinds of engine, and an optional trailer coupling, i.e. 2·3·2 = 12).
Constraints that cannot be expressed in a feature diagram have to be recorded 
separately. For example, let’s assume that an electric or hybrid engine requires an 
automatic transmission. With this constraint, we have just eight valid feature 
combinations left. The feature diagram, the constraints, and other information (e.g. 
binding times, descriptions, etc.) constitute a feature model. Modeling the semantic 
contents of the features usually requires other kinds of diagrams (e.g. object diagrams 
or interaction diagrams). 
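
The two counts above are easy to check mechanically. The following sketch uses a hypothetical encoding of the diagram (the names CarVariant, allVariants, and isValid are ours): it enumerates every combination the diagram admits and then filters by the separately recorded constraint.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical encoding of the feature diagram in Fig. 2.
enum class Transmission { Automatic, Manual };
enum class Engine { Gasoline, Electric, Hybrid };  // Hybrid = both or-features

struct CarVariant {
    Transmission transmission;
    Engine engine;
    bool trailerCoupling;
};

// Every combination the diagram alone allows: 2 * 3 * 2 variants.
inline std::vector<CarVariant> allVariants() {
    std::vector<CarVariant> variants;
    for (Transmission t : {Transmission::Automatic, Transmission::Manual})
        for (Engine e : {Engine::Gasoline, Engine::Electric, Engine::Hybrid})
            for (bool tc : {false, true}) {
                CarVariant v = {t, e, tc};
                variants.push_back(v);
            }
    return variants;
}

// Constraint recorded outside the diagram: an electric or hybrid
// engine requires an automatic transmission.
inline bool isValid(const CarVariant& v) {
    const bool needsAutomatic = v.engine != Engine::Gasoline;
    return !needsAutomatic || v.transmission == Transmission::Automatic;
}

inline std::size_t countValid(const std::vector<CarVariant>& variants) {
    std::size_t n = 0;
    for (const CarVariant& v : variants)
        if (isValid(v)) ++n;
    return n;
}

// Reproduces the two numbers quoted in the text.
inline void checkCounts() {
    const std::vector<CarVariant> variants = allVariants();
    assert(variants.size() == 12);
    assert(countValid(variants) == 8);
}
```

The constraint removes the four variants that pair a manual transmission with an electric or hybrid engine (two engines, each with and without a trailer coupling).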

The important property of a feature diagram is that it allows us to model variability 
without having to commit to a particular implementation mechanism such as 
inheritance, aggregation, templates, or #ifdef-directives.

5.2 Domain Design 

Now we need to come up with the architecture for the product line. This involves 
answering questions such as what kinds of components are needed, how they will be 
connected, what kind of middleware or component model will be used, what interfaces 
the component categories will have, how they will accommodate the requirements, etc. 
Designing the architecture is an iterative process and it usually requires prototyping. 
Studying existing architecture styles and patterns greatly helps in this process (e.g. see 
[BMR+96]). 

For our car example, we’ll use a particular kind of a layered architecture, called a 
GenVoca architecture (see [BO92, SB98] and also [ML98]). This kind of architecture
proved to be useful for a wide variety of systems (see Section 6). In our experience, 
designing a GenVoca architecture requires the following steps: 

Identify the main functionalities in the feature diagrams from the Domain 
Analysis. The main functionalities for our sample car are car body, transmission, 
engine, and trailer coupling. 
Enumerate component categories and components per category. The component
categories correspond to the four main functionalities listed above. The component 
categories and the components per category for our sample car are shown in Fig. 3. 



CarBody:         CarBody
Transmission:    AutomaticTransmission, ManualTransmission
Engine:          GasolineEngine, ElectricEngine, HybridEngine
TrailerCoupling: TrailerCoupling



Fig. 3. Component categories for the car product line 



Identify “uses” dependencies between component categories. For our example,
let’s assume that CarBody uses Engine (1) and Transmission (2),
Transmission uses Engine (3), and TrailerCoupling uses CarBody (4)
(see Fig. 4 a).





[Fig. 4 c): layers from top to bottom]

Car
TrailerCoupling (optional)
CarBody
Automatic | Manual
Gasoline | Electric | Hybrid
ConfigurationRepository



Fig. 4. Derivation of a layered architecture 

Sort the categories into a layered architecture. The component categories can be 
arranged into a hierarchy of layers, where each layer represents a category and the 
categories that most other categories depend on are moved towards the bottom of the 
hierarchy (see Fig. 4 b). Finally, we add two more layers: Car on the top and
ConfigurationRepository at the bottom (see Fig. 4 c). Car is the top layer
identifying all cars. ConfigurationRepository is used to communicate
configuration information to all layers. This is possible since a layer may retrieve
information from any layer below it. You’ll see how this works in Section 5.3. The
dashed box around TrailerCoupling indicates that this layer is optional. The
Transmission and the Engine layer display the alternative components separated
by vertical bars.
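
The sorting step can be mechanized in more than one way. One possibility, shown here purely as an illustration and not as a prescribed procedure, is to order the categories by the length of the longest “uses” chain below them. The names UsesGraph, height, and layersTopToBottom are hypothetical, and the sketch assumes that the dependencies contain no cycles.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Maps each component category to the categories it uses.
typedef std::map<std::string, std::vector<std::string> > UsesGraph;

// The four "uses" dependencies assumed for the car example.
inline UsesGraph carUsesGraph() {
    UsesGraph uses;
    uses["CarBody"] = std::vector<std::string>{"Engine", "Transmission"};
    uses["Transmission"] = std::vector<std::string>{"Engine"};
    uses["TrailerCoupling"] = std::vector<std::string>{"CarBody"};
    uses["Engine"] = std::vector<std::string>();
    return uses;
}

// Length of the longest "uses" chain starting at a category.
// Assumes the graph is acyclic.
inline int height(const UsesGraph& uses, const std::string& category) {
    int h = 0;
    UsesGraph::const_iterator it = uses.find(category);
    if (it != uses.end())
        for (const std::string& used : it->second)
            h = std::max(h, 1 + height(uses, used));
    return h;
}

// Categories that most others depend on end up at the bottom.
inline std::vector<std::string> layersTopToBottom(const UsesGraph& uses) {
    std::vector<std::string> layers;
    for (const auto& entry : uses)
        layers.push_back(entry.first);
    std::stable_sort(layers.begin(), layers.end(),
                     [&uses](const std::string& a, const std::string& b) {
                         return height(uses, a) > height(uses, b);
                     });
    return layers;
}

inline void checkCarLayers() {
    const std::vector<std::string> expected = {
        "TrailerCoupling", "CarBody", "Transmission", "Engine"};
    assert(layersTopToBottom(carUsesGraph()) == expected);
}
```

For the four dependencies of our example this yields TrailerCoupling, CarBody, Transmission, Engine from top to bottom, which matches the order of the inner layers in Fig. 4 c.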
Write down the GenVoca grammar. The main idea of the layered architecture in
Fig. 4 c is that a component from a given layer takes another component from the
layer just below it as a parameter, e.g. CarBody may take AutomaticTransmission
or ManualTransmission as a parameter, i.e. we may have
CarBody[AutomaticTransmission[...]] or CarBody[ManualTransmission[...]].
Using this idea, we can represent this layered architecture as a set of grammar
rules (see Fig. 5). Please note that vertical bars separate alternatives and we have
abbreviated ConfigurationRepository to Config.

Car:                    Car[CarBodyWithOptTC]
CarBodyWithOptTC:       CarBodyWithTC[CompleteCarBody] | CompleteCarBody
CompleteCarBody:        CarBody[TransmissionWithEngine]
TransmissionWithEngine: ManualTransmission[Engine] | AutomaticTransmission[Engine]
Engine:                 GasolineEngine[Config] | ElectricEngine[Config] | HybridEngine[Config]
Config:                 speeds, Engine, Transmission, CarBody, Car

Fig. 5. GenVoca grammar for the car product line 

At this point the architecture for our car product line is finished. Of course, the steps 
require an iterative process and prototyping to come up with the final set of 
components. The latter is also needed to find a stable interface for each of the 
component categories (each component in a category is required to implement the 
category interface). 

5.3 Implementation Components 

Once we have the architecture, we can implement the components. As stated, a 
component from a given layer takes a component from the layer below it as a 
parameter, i.e. we need to implement the components as parameterized components. 
In C++, we can use class templates for this purpose. For example, 
GasolineEngine can be implemented as follows: 

#include <iostream>
using std::cout;
using std::endl;

template <class Config_>
struct GasolineEngine
{ typedef Config_ Config; // publish Config as a member type
  GasolineEngine() { cout << "GasolineEngine "; }
};

GasolineEngine takes Config_ as its parameter and publishes it under the new
name Config. Any other component that takes GasolineEngine as its parameter
can retrieve Config from it using the C++ scope operator ::, i.e.
GasolineEngine::Config. ElectricEngine and HybridEngine are
implemented in a similar way and are not shown here.

The next component is ManualTransmission. According to the grammar in Fig. 
5, it takes an Engine as its parameter: 

template <class Engine>
struct ManualTransmission
{ typedef typename Engine::Config Config; // retrieve and republish Config
  enum { speeds = Config::speeds };

  Engine e;

  ManualTransmission() { cout << speeds << "-SpeedManualTransmission "; }
};
ManualTransmission retrieves Config from its parameter Engine and
publishes it under the alias Config. This is how Config, which is the bottom layer 
in the hierarchy, is propagated all the way up to the top layer. The line below the 
typedef-declaration shows you how Config is used. In this example, 
ManualTransmission retrieves the number of speeds from the Config (e.g. 4 or 
5). The remaining components AutomaticTransmission, CarBody, and 
CarBodyWithTC (TC stands for trailer coupling) are implemented in a similar way. 
Let us just take a look at the last component, namely Car: 

template <class CarBody_> 
struct Car 
{ typedef typename CarBody_::Config Config; //Config is part of any car! 
  CarBody_ cb; 
  Car() { cout << "Car " << endl << endl; } 
}; 



5.4 Manual Assembly 

Now that we have the implementation components, we can build different cars by 
writing down different configuration repositories, in which the appropriate 
components are assembled together. For example, the following configuration 
repository defines a car with a gasoline engine, five-speed manual transmission, and 
without a trailer coupling: 

struct Config1 
{ enum { speeds = 5 }; 
  typedef GasolineEngine<Config1> Engine; 
  typedef ManualTransmission<Engine> Transmission; 
  typedef ::CarBody<Transmission> CarBody; //:: names the template, not this member 
  typedef ::Car<CarBody> Car; 
}; 

And a car with an electric engine, four-speed automatic transmission, and without a 
trailer coupling looks as follows: 

struct Config2 
{ enum { speeds = 4 }; 
  typedef ElectricEngine<Config2> Engine; 
  typedef AutomaticTransmission<Engine> Transmission; 
  typedef ::CarBody<Transmission> CarBody; //:: names the template, not this member 
  typedef ::Car<CarBody> Car; 
}; 

You can declare an instance of the latter car as follows: 

Config2::Car c2; 
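To see the layers and a configuration repository working together, the following is a minimal self-contained sketch rather than the complete source of the example. Three things in it are our assumptions: the constructors append to a global string build_log instead of writing to cout (so that the construction order can be inspected), CarBody is written following the pattern of the other components, and build_car1 is a hypothetical helper.

```cpp
#include <string>

// Assumption: a global log replaces the cout output used in the text.
static std::string build_log;

template <class Config_>
struct GasolineEngine {
    typedef Config_ Config;                    // publish Config as a member type
    GasolineEngine() { build_log += "GasolineEngine "; }
};

template <class Engine>
struct ManualTransmission {
    typedef typename Engine::Config Config;    // propagate Config upwards
    enum { speeds = Config::speeds };
    Engine e;
    ManualTransmission() {
        build_log += std::to_string(int(speeds)) + "-SpeedManualTransmission ";
    }
};

// Assumption: CarBody follows the same pattern as the other layers.
template <class Transmission>
struct CarBody {
    typedef typename Transmission::Config Config;
    Transmission t;
    CarBody() { build_log += "CarBody "; }
};

template <class CarBody_>
struct Car {
    typedef typename CarBody_::Config Config;
    CarBody_ cb;
    Car() { build_log += "Car"; }
};

// Configuration repository: gasoline engine, five-speed manual transmission.
struct Config1 {
    enum { speeds = 5 };
    typedef GasolineEngine<Config1> Engine;
    typedef ManualTransmission<Engine> Transmission;
    typedef ::CarBody<Transmission> CarBody;   // :: refers to the template
    typedef ::Car<CarBody> Car;
};

// Hypothetical helper: builds one car and returns the construction log.
std::string build_car1() {
    build_log.clear();
    Config1::Car c1;
    (void)c1;
    return build_log;
}
```

Because each component holds the component of the layer below as a data member, the members are constructed bottom-up: the engine is logged first and the car last.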

Writing configuration repositories is a tedious exercise. The person writing them 
needs to know which implementation components are available, which configurations 
are valid (e.g. an electric or hybrid engine requires an automatic transmission), 
and which configurations are optimal or satisfy some other requirement. Thus, 
requiring the application programmer to write a configuration repository places quite a 
burden on her. Even worse, it couples client code too strongly to the 
architecture and the implementation components, since changes to the architecture 
(e.g. adding a new layer) may require modifying all configuration repositories. 

An alternative would be to include all possible configuration repositories in the 
library. However, this is usually not practicable since there is normally a large number 
of configurations (e.g. the matrix computation library described in Section 6 would 
require 1840 configuration repositories) and each of them is usually longer than in the 
car example. The solution is to generate the configuration repositories from more 
abstract descriptions. But before taking a look at how this works, we need to figure 
out what an “abstract description” means in our context, i.e. how to conveniently order 
cars. 



5.5 Ordering Cars 



When you order a car, you don’t have to describe to the car dealer from which parts to 
assemble it. Instead, you specify the class, the line, and the options, which are usually 
listed in a product information brochure. An example of such a brochure for our car 
product line is shown in Fig. 6. The brochure specifies features available as standard 
equipment or as options for each line. Please note that according to the brochure, it is 
not possible to order an electric or hybrid engine together with a manual transmission. 




                                      BaseLine   CityLine   EcoLine 
Engine         gasoline                  • 
               electric                             • 
               hybrid                                          • 
Transmission   five-speed manual         • 
               four-speed automatic      o          •          • 
Options        trailer coupling          o          o          o 

• standard equipment 
o options available for a surcharge 



Fig. 6. Product information brochure for our car product line 

We can easily implement this brochure in C++. Since a client specifies a car by stating 
the desired line and options, we need to provide the vocabulary representing the lines 
(BaseLine, CityLine, and EcoLine) and options (transmission and trailer coupling). 
The enumeration type Transmission will be used to specify the transmission and 
Options will be used to state whether a trailer coupling is available or not: 

enum Transmission { fiveSpeed, fourSpeedAutomatic } ; 
enum Options { none, trailercoupling }; 

Now we need the vocabulary representing the three lines. We will model this 
vocabulary using types rather than enumeration constants since they will contain a 
member indicating the required transmission: 

template <int transmission_ = fiveSpeed> 
struct BaseLine 
{ enum { transmission = transmission_, 
         line = baseline 
       }; 
}; 

BaseLine is available with either manual transmission (standard) or automatic 
transmission (option). Therefore, we have the template parameter transmission_ 
with five-speed manual transmission as default. The second member is just a line 
identifier, so that we can verify at compile time whether a line type is BaseLine or 
not based on the value of BaseLine::line. Here is the declaration for baseline 
and the two remaining line identifiers: 

enum Line { baseline, cityline, ecoline }; //for internal use only 

Finally, we have the two remaining lines CityLine and EcoLine: 

struct CityLine 
{ enum { transmission = fourSpeedAutomatic, 
         line = cityline 
       }; 
}; 

struct EcoLine 
{ enum { transmission = fourSpeedAutomatic, 
         line = ecoline 
       }; 
}; 
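The compile-time check on the line identifier mentioned above can be sketched as follows. The vocabulary is repeated so that the fragment is self-contained; IsBaseLine is a hypothetical helper introduced only for this illustration.

```cpp
enum Transmission { fiveSpeed, fourSpeedAutomatic };
enum Line { baseline, cityline, ecoline };   // for internal use only

template <int transmission_ = fiveSpeed>
struct BaseLine {
    enum { transmission = transmission_, line = baseline };
};

struct CityLine {
    enum { transmission = fourSpeedAutomatic, line = cityline };
};

struct EcoLine {
    enum { transmission = fourSpeedAutomatic, line = ecoline };
};

// Hypothetical helper: value is 1 if LineType is some BaseLine<...>, else 0.
// The decision is made purely from the published member 'line'.
template <class LineType>
struct IsBaseLine {
    enum { value = (int(LineType::line) == int(baseline)) };
};
```

Since CityLine and EcoLine are not templates, a client has no way to pair them with a manual transmission; only BaseLine exposes the transmission as a parameter.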

Please note that whenever you select a CityLine or an EcoLine, you automatically 
select a four-speed automatic transmission. Thus, the vocabulary for ordering cars is 
designed to exclude the possibility of specifying an illegal feature combination (e.g. 
electric engine and manual transmission). In general, when you design a 
domain-specific language for “ordering” different systems or components, you may 
prevent illegal feature combinations either by structuring the language as in the car 
example or by having an extra “buildability checking” step in the generator. The first 
option is preferred if the configuration space is highly irregular, i.e. there are many 
illegal combinations compared to the total number of combinations (so the probability 
of a mistake is high). The second option is better if there are only a few illegal 
combinations. In this case, the language is simpler and the few possible mistakes are 
caught by the generator. For example, if the only constraint were that a hybrid engine 
requires an automatic transmission, we could specify a car using three enumeration 
types Engine, Transmission, and Options. The generator would be responsible for 
detecting the illegal combination of a hybrid engine and a manual transmission. 
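As a rough illustration of the second option, a buildability check can be expressed as a compile-time predicate that the generator evaluates before assembling anything. The sketch below is not part of the car example: all names (EngineKind, TransmissionKind, OptionKind, Buildable, CheckedSpec) are hypothetical, and it uses static_assert, a C++11 feature that was not available when the example was written.

```cpp
// Hypothetical ordering vocabulary: three independent enumerations.
enum EngineKind { gasolineEngine, electricEngine, hybridEngine };
enum TransmissionKind { manualTransmission, automaticTransmission };
enum OptionKind { noOptions, withTrailerCoupling };

// Buildability rule assumed for this sketch (the one named in the text):
// a hybrid engine requires an automatic transmission.
template <int engine, int transmission>
struct Buildable {
    enum { value = !(engine == hybridEngine &&
                     transmission == manualTransmission) };
};

// Front end of a generator: an illegal order is rejected at compile time,
// a legal order is recorded for the later assembly steps.
template <int engine, int transmission, int options = noOptions>
struct CheckedSpec {
    static_assert(Buildable<engine, transmission>::value,
                  "a hybrid engine requires an automatic transmission");
    enum { engineKind = engine,
           transmissionKind = transmission,
           optionKind = options };
};
```

With this design the language stays flat and simple, and the single illegal combination produces a readable compiler message instead of being unrepresentable.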

Another issue in the design of a domain-specific language is the level of specificity it 
supports. Ideally, it should support a spectrum from being unspecific (e.g. “give me a 
car”) to being able to specify details about the implementation components, or even 
providing user-defined components in the specification. The trade-offs here include 
the required level of detail, the stability of the architecture, and the complexity of the 
configuration space. In any case, supporting different levels of specificity requires 
defining default settings and default computation rules. 

5.6 The Generator 

The generator takes a specification of a system or component and returns the finished 
system or component. 
Fig. 7. Stages of a configuration generator 

In general, a configuration generator [CE99a] performs the following steps (see Fig. 
7): it checks if the specified system can be built, completes the specification (by 
computing defaults), and assembles the implementation components. In our car 
example, however, there is no buildability checking (since the domain-specific 
language doesn’t give an opportunity to specify illegal feature combinations) and there 
are no computed defaults (as you’ll see in a moment, the few direct default settings are 
specified in the parameter list of the generator). 

Of course, we could implement the generator as a pre-processor generating 
configuration repositories in C++ source code. A better alternative, however, is to use 
the built-in metaprogramming capabilities of C++, i.e. template metaprogramming. 
Without explaining all the details, let us just state that C++ templates constitute a 
compile-time, Turing-complete sublanguage of C++. In other words, you can use the 
template instantiation process to perform arbitrary computations at compile time (this 
was first observed by Erwin Unruh [Unr94]). By now, there is a whole set of 
programming idioms and principles based on this idea, which are collectively referred 
to as “template metaprogramming” [Vel95, CE99b]. For the purpose of this article, 
you only need to understand that the generator is implemented as a template taking the 
description of a car as its parameter and returning the finished car type in a specially 
designated member, which is by convention called RET (which stands for RETURN). 
To make this discussion more concrete, let’s take a look at how you would create an 
instance of a BaseLine car with a four-speed automatic transmission and a trailer 
coupling: 

CAR_GENERATOR<BaseLine<fourSpeedAutomatic>, trailercoupling>::RET car1; 

or an EcoLine car with a four-speed automatic transmission and without a trailer 
coupling: 

CAR_GENERATOR<EcoLine>::RET car2; 
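If the idea of computing with template instantiation is unfamiliar, the standard textbook example below may help. It is not taken from the car example, and the name FACTORIAL is ours; it only illustrates the RET convention, here returning a value where the car generator returns a type.

```cpp
// Recursive case: instantiating FACTORIAL<n> forces the compiler to
// instantiate FACTORIAL<n - 1>, and so on down to the base case.
template <int n>
struct FACTORIAL {
    enum { RET = n * FACTORIAL<n - 1>::RET };
};

// Base case: an explicit specialization stops the recursion.
template <>
struct FACTORIAL<0> {
    enum { RET = 1 };
};

// The result is a compile-time constant and can be used as an array bound.
typedef int Table[FACTORIAL<4>::RET];   // an array of 24 ints
```

The "function call" is a template instantiation, the "argument" is a template parameter, and the "return value" is the member RET.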

The implementation of CAR_GENERATOR is given in Fig. 8. It uses the templates IF 
and SWITCH, which correspond to the familiar selection statements if and switch 
(their implementation is described in [CE99b, CE98]). 
template <class Line = BaseLine<>, int options_ = none> 
struct CAR_GENERATOR 
{ typedef CAR_GENERATOR<Line,options_> Generator; 

  //parse the car spec 
  enum { line_ = Line::line, 
         transmission_ = Line::transmission 
       }; 

  //assemble components (typename is needed because RET is a dependent name) 
  enum { speeds_ = (transmission_ == fiveSpeed) ? 5 : 4 }; 
  typedef typename SWITCH<line_, 
            CASE<baseline, GasolineEngine<Generator>,  //see first footnote 
            CASE<cityline, ElectricEngine<Generator>, 
            CASE<ecoline,  HybridEngine<Generator> 
          > > > >::RET Engine_; 
  typedef typename IF<(transmission_ == fiveSpeed), 
            ManualTransmission<Engine_>, 
            AutomaticTransmission<Engine_> 
          >::RET Transmission_; 
  typedef typename IF<(options_ == trailercoupling), 
            CarBodyWithTC<CarBody<Transmission_> >, 
            CarBody<Transmission_> 
          >::RET CarBody_; 

  typedef Car<CarBody_> RET; //return the finished car! 

  //provide the Config (see second footnote) 
  struct Config 
  { typedef Engine_ Engine; 
    typedef CarBody_ CarBody; 
    enum { speeds = speeds_, 
           line = line_, 
           transmission = transmission_, 
           options = options_ 
         }; 
    typedef RET Car; 
  }; 
}; 



Fig. 8. C++ implementation of the car generator 

The advantage of the implementation using template metaprogramming is that the 
generator can be used just like any other template and we don’t need any extra 
pre-processors. The interesting aspect of this kind of metaprogramming is that the 
metacode performing the configuration at compile time is part of the library of 
domain-specific concepts, just like any other code implementing the concepts (e.g. the 
implementation components). 
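The generator in Fig. 8 relies on the templates IF and SWITCH, whose actual implementations are given in [CE99b, CE98]. To give an idea of how such compile-time selection can work, here is a minimal sketch based on partial specialization. It is not necessarily identical to the cited implementations; in particular, the end-of-list marker NilCase and the typedef Picked are names we assume for this illustration.

```cpp
// IF<cond, Then, Else>::RET is Then if cond is true, otherwise Else.
template <bool cond, class Then, class Else>
struct IF { typedef Then RET; };

template <class Then, class Else>
struct IF<false, Then, Else> { typedef Else RET; };

// End-of-list marker for CASE chains (hypothetical name).
struct NilCase {};

// One case: a tag value, the type selected for it, and the rest of the list.
template <int tag_, class Type_, class Next_ = NilCase>
struct CASE {
    enum { tag = tag_ };
    typedef Type_ Type;
    typedef Next_ Next;
};

// SWITCH walks the CASE list and returns the type of the first matching tag.
template <int tag, class Case>
struct SWITCH {
    typedef typename IF<(tag == Case::tag),
                        typename Case::Type,
                        typename SWITCH<tag, typename Case::Next>::RET
                       >::RET RET;
};

// No case matched: the result is NilCase.
template <int tag>
struct SWITCH<tag, NilCase> { typedef NilCase RET; };

// Usage: pick a type from an integer tag at compile time.
typedef SWITCH<2, CASE<1, char, CASE<2, double> > >::RET Picked;   // double
```

Both templates "return" a type through the member RET, which is what allows the generator to chain them when it selects the engine, the transmission, and the car body.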



* Since we pass Generator to each engine instead of Config, the engine implementation 
components GasolineEngine, ElectricEngine, and HybridEngine need to be 
slightly modified to retrieve the Config from their Generator parameter. 

** Please note that Config becomes part of any generated car type and can be later accessed by 
other generators, e.g. generators for algorithms displaying cars. 
6 Applications 

We’ve selected the car example for this paper for pedagogical reasons. You’ll find 
some more computer-oriented examples in [CE99a] (a list container) and in [EC99] (a 
bank account). We have used the techniques presented here in real-scale systems: 

Generative Matrix Computation Library (GMCL) [GMCL, Cza98, Neu98] is a 
generative C++ library for matrix computations. It contains a configuration generator 
for generating different kinds of matrix types (different element types, density, storage 
formats, memory allocation, and error checking) and another kind of generator for 
generating optimized implementations of matrix expressions, e.g. (A+B) * (C+D). 
GMCL comprises 7500 lines of C++ code and is capable of generating 1840 different, 
highly-efficient matrix types. 

Generative Matrix Factorization Library [Kna98] contains a configuration 
generator synthesizing different instances of the LU factorization algorithm family 
(e.g. Gauss, Cholesky, LDL^T) with different pivoting strategies (e.g. partial, full, 
symmetric, diagonal) and for different matrix shapes. 

Generative Library for Statistics in Postal Automation [OSVA99] contains 
configuration generators for different kinds of counters, timers, and statistic 
algorithms. It is being used by Siemens Electrocom, a world leader in postal 
automation. 

7 Active Libraries 

As you saw, the implementation of the generator required metaprogramming. We have 
used template metaprogramming for this purpose. This was OK, but not perfect due to 
debugging problems (there is no debugger for the C++ compilation process!) and long 
compilation times (C++ compilers are not optimized for that kind of strange template 
programming!). Nevertheless, we were able to demonstrate the idea of putting 
metacode into domain-specific libraries. In a sense, you can think of the metacode 
performing compile-time configuration and optimization as extending the C++ 
compiler. In general, you would also like to have other kinds of metacode extending 
just about any aspect of a programming environment. This brings us to the idea of 
active libraries [CEG+98], which “are not passive collections of routines or objects, 
as are traditional libraries, but take an active role in generating code. Active libraries 
provide abstractions and can optimize those abstractions themselves. They may 
generate components, specialize algorithms, optimize code, automatically configure 
and tune themselves for a target machine, and check source code for correctness. They 
may also describe themselves to tools such as profilers and debuggers in an intelligible 
way.” An example of a system supporting this idea is Intentional Programming (IP), 
which is being developed at Microsoft Research [Sim96]. IP is an extendible 
programming environment which lets you contribute metacode to extend any aspect of 
the system including the debugger, the compilation system, and the source editing and 
display system. It is important to note that the program source IP operates on is not 
text but an active object structure. This way, all kinds of domain-specific notations are 
possible (both textual and graphical) and the programmer can actually interact with 
his program code while coding. The wide availability of such extendible programming 
environments will bring a wholly new dimension to Generative Programming. 

8 Conclusions 

A component is always a part of a well-defined production process. For example, a 
brick is a component in the process of building houses, not cars. Often cited criteria 
such as binary format, interoperability, language independence, etc., are always 
relative to the production process. For example, if you need containers in C++, STL 
components are just fine. If you need to build GUI windows, you need visual 
components (e.g. a JavaBean). If you need language-independent, distributed 
components, you may use CORBA. 

Just as it took several decades for the idea of interchangeable parts to be widely used 
in manufacturing, the transition to interchangeable software components will not 
happen instantly. In particular, there is a cultural change required on the part of 
customers, consultants, and vendors to accept solutions based on standard 
componentry rather than “artistic” individual solutions. The introduction of 
interchangeable software components requires product-line architectures to be in 
place. Only then is it possible to say easily and quickly whether a component 
offers what a given system expects or not. Thus, we’ll need more architectural 
standardization in different industries before the idea of software components truly 
takes off. 

If you can assemble your components manually, you can also automate the assembly 
process using a generator. Automation is a logical step once you have a plug-and-play 
architecture in place. However, just as there is an additional cost to developing 
reusable software rather than just single systems, there is an additional cost to 
automation. The availability of standard architectures and components and 
industrial-quality metaprogramming environments based on the idea of active libraries will help 
to reach the break-even point more quickly. 

One final note: Generation can be performed both statically and dynamically, so your 
metaprogramming environment should allow you to execute metacode at different 
times! 

Note: The complete source code for the car example is available at 
http://nero.prakinf.tu-ilmenau.de/~czarn/esec99 
Abstract. We present a communication and component model for push systems. 
Surprisingly, despite the widespread use of many push services on the Internet, 
no such models exist. Our communication model contrasts push systems with 
client-server and event-based systems. Our component model provides a basis for 
comparison and evaluation of different push systems and their design alternatives. 
We compare several prominent push systems using our component model. The 
component model consists of producers and consumers, broadcasters and chan- 
nels, and a transport system. We detail the concerns of each of these components. 
Finally, we discuss a number of open issues that challenge the widespread deploy- 
ment of push or any other system on an Internet-wide scale. Payment models are 
the most important among these and are not adequately addressed by any existing 
system. We briefly present the payment approach in our Minstrel project. 



1 Introduction 

The dominant paradigm of communication on the worldwide web and in most distributed 
systems is the request-reply model. In this model of distributed information systems, a 
client actively “pulls” information from the server. Even from the early days of the Inter- 
net, systems such as electronic mail and Usenet News have attempted to overcome the 
deficiencies of the pull model of communication by allowing producers of information 
to “push” their information closer to the clients. In the push model of communication, an 
information producer announces the availability of certain types of information, an inter- 
ested consumer subscribes to this information, and the producer periodically publishes 
the information (pushes it to the consumer). The pull and push models are contrasted in 
Fig. 1. 

The first push system approach was introduced in 1995 by the dynamic document 
concept of Netscape Navigator 1.1 [20]. Its basic ideas are server push and client pull. 
With server push the server sends data which is displayed by the browser but the con- 
nection between server and client remains open. Later the server may continue to send 
other pieces of data to the client. Client pull automates reloads: the server sends data 
including a Refresh directive specifying a time and a URL in the HTTP response or 
in the document header. After the given time the client loads the document specified by 
the URL. 



O. Nierstrasz, M. Lemoine (Eds.): ESEC/FSE’99, LNCS 1687, pp. 20-38, 1999. 
© Springer-Verlag Berlin Heidelberg 1999 




A Component and Communication Model for Push Systems 






[Figure: in the pull model the consumer sends a Request to the producer; in the push 
model the producer performs Announce and Publish, and the consumer performs 
Subscribe, Receive, and Unsubscribe.] 

Fig. 1. Pull vs. push 



To show the wide applicability of the push model as a paradigm for building dis- 
tributed applications, here we outline three distinct applications whose design can be 
decomposed in terms of push concepts. 

1. Intra-company employee information system. Many organizations have proprietary 
and ad hoc systems for keeping their employees informed about their organizational 
news. This is sometimes viewed as one of the organization’s most important and 
most difficult tasks. Such a system may be built as a standard push system. 

2. Electronic maintenance manuals. Companies that produce appliances have mainte- 
nance manuals that are carried by their maintenance workers when they are called 
to repair appliances on site. The updating of such maintenance manuals is costly 
and dealing with the paper manuals is tedious. With a push system each product line 
could be associated with one channel and each maintenance worker subscribes to 
the desired channel. 

3. Stock ticker system. This is a classic example for event-based and push models. 

The existence of such diverse applications, all of which can be designed as specific 
instances of push-based systems, argues for the inherent utility of the push model and 
concepts. In all of these systems, we can easily identify distinct producers and consumers, 
and also the necessity for a subscription phase. 

The goal of this paper is to contrast the push communication model with existing 
communication models and present a component model for push systems. The compo- 
nent model provides a basis for comparison and evaluation of different push systems 
and their design alternatives. Further, the component model may be used as a basis for 
a reference implementation and as a source of identifying some open issues. 

The paper is organized as follows: Section 2 compares the communication models 
for distributed systems and analyzes their impact on scalability, network load, and state 
maintenance. Section 3 presents our component model for push systems which is used in 
Sect. 4 for the comparison of representative push systems. Section 5 then discusses the 
relationship of push systems with mobile code and event-based systems. In Sect. 6 we 
present the main issues that push systems must address to become usable on an Internet 
scale. We summarize and give our conclusions in Sect. 7. 
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2 A Comparison of Distributed Communication Models 

A distributed system consists of several computing nodes connected by a computer net- 
work that provides for communication among the nodes. When applied in the distributed 
environment, the traditional software decomposition task must also deal with allocating 
(or mapping) software modules to the different nodes. One of the performance goals of 
distributed software design is to minimize the amount of needed communication among 
the nodes. Less communication leads to higher scalability, that is, the ability of the 
system to support more nodes or more users. 

The client-server model provides an architectural approach for organizing the soft- 
ware for distributed platforms. The model assumes a small number of servers (say, 10) 
and a moderate number of clients (say, 1000). The basic scheme is that clients interact 
with (human) users and contact the servers to ask for (computationally-intensive or data- 
intensive) services. The communication model of client-server systems may be called 
session-based (or stateful). During a session a state is shared between a client and a 
server and is modified through one or more interactions between them. 

The emergence of the Internet and its use as a platform for distributed applications, 
however, exposed the weaknesses of the session-based communication model in terms of 
scalability. Internet applications must scale to millions of nodes and users. The primary 
impediment to scalability is the participants’ need to maintain a shared state. 
Consequently, in the interest of scalability, the world-wide web adopts a stateless approach to 
client-server communication (web-based model). In this scheme, each interaction 
between the client and the server is independent of the other interactions. No “permanent” 
connection is established between the client and the server and the server maintains 
no state information about the clients. While this scheme helps scalability, it becomes 
difficult to maintain a state; the client, the server, or both must maintain the state and 
ensure its coherence. Web-based applications scale to 1000s of servers and 1,000,000s 
of clients. Depending on the requirements of an application, the application designer 
may choose between these two models in client-server computing. 

The client-server model deals with two participants in the communication. In the 
peer-to-peer model the application is decomposed among many peer nodes as opposed to 
clients and servers. For this reason, we refer to the nodes here as producers and consumers 
rather than clients and servers. In the peer-to-peer model communication begins with 
a subscription phase in which a consumer registers its interest with a producer. At this 
point, we may divide the peer-to-peer model also into two subclasses: the event-based 
and the push-based models. 

In the event-based model, nodes are loosely-connected and behave symmetrically: 
any node may produce events and any node may consume events. This model scales to 
many producers and many consumers because there is no coupling between them [28]. 

The communication model of push-based systems, on the other hand, is tightly cou- 
pled and asymmetric: certain nodes are designated as producers and others as consumers. 
In contrast with the event-based model, push-based systems scale to fewer producers 
but more consumers. They may be viewed as a specialization of the event-based sys- 
tems with designated producers and consumers and channels to connect each producer 
with interested consumers. Dissemination in push-based systems is done on the basis 
of particular channels rather than event classes as in event-based systems. The use of 
channels for information classification is a major distinction from event-based systems. 
Channels increase the coupling but improve the performance in certain situations. 
Table 1 contrasts the above four communication models in terms of the primary tradeoff 
between coupling and scalability. 



Table 1. Comparison of communication models 





                 Client-server                        Peer-to-peer 
                 Session-based     Web-based          Event-based      Push-based 
Coupling         tight             loose              very loose       medium 
# of clients     moderate (1000)   high (1,000,000)   many (100,000)   many (100,000) 
# of servers     few (10)          many (100,000)     many (100,000)   few (100) 



The four communication models of distributed systems described above occupy four 
areas in the design space of distributed systems. Figure 2 gives a rough view of the design 
space plotted along coupling and scalability axes. 




Fig. 2. Degree of coupling vs. degree of scalability 



3 A Component Model for Push Systems 

In this section, we present a component model for push systems which we have derived 
from an analysis of existing push systems. 

In its simplest form a push system consists of producers and consumers of information 
that are connected through channels. A consumer (receiver) subscribes to a channel and 
receives any information that is sent on the channel. A producer (information source) 
sends data through channels. Thus the connection phase of client-server systems is 
replaced by an early subscription phase. 

In practice we use a broadcaster component to separate the concerns of channel han- 
dling from the information source. A broadcaster is responsible for managing channels 
and sending information along channels. 

An information source feeds information into a broadcaster together with rules on 
how and where (which channel) to distribute this data. The broadcaster may apply filters 
to the data and then disseminate the data via channels to consumers that have subscribed 
to receive the content of certain channels. Receivers may apply filters too and accept 
content only if it passes through the filters. To provide scalability to high numbers of users 
the distribution process involves a transport system which is conceptually transparent 
for broadcasters, receivers, and channels. 

Figure 3 depicts our component model of a push system and Fig. 4 gives a sample 
collaboration (UML sequence diagram) among the components of the model. 




Fig. 3. Components of a push system 



The information source provides new data for a specific channel to the broadcaster. 
The broadcaster applies filters on the data to limit data transfers and sends the data (in 
parallel or iteratively) to a set of repeaters (for scalability reasons) for which the filters 
succeeded. The repeaters then redistribute the data to receivers. For higher scalability, 
additional levels of repeaters may be necessary. Every broadcaster can send to multiple 
channels and every receiver can receive from multiple channels. In Fig. 3 some of the 
arcs representing channels and backchannels cut through components of the transport 
system to indicate that these components are necessary for scalability purposes but are 
transparent to the channels and the dissemination process. 
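The flow just described (information source, broadcaster, repeaters, receivers) can be made concrete with a small sketch. The paper defines no programming interface, so every name below is an assumption made for illustration; the method names contentUpdate and sendContent merely echo the message names of the sample collaboration. The sketch also simplifies the model: the broadcaster applies a single producer-side filter instead of one filter per repeater, and subscription is reduced to registering a receiver with a repeater.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// All names are illustrative; the paper defines no concrete API.
typedef std::function<bool(const std::string&)> Filter;

// A receiver (consumer) keeps the content that passes its own filter.
struct Receiver {
    Filter accept;                      // consumer-side filter, may be empty
    std::vector<std::string> inbox;
    void sendContent(const std::string& data) {
        if (!accept || accept(data)) inbox.push_back(data);
    }
};

// A repeater redistributes channel content to its receivers.
struct Repeater {
    std::vector<Receiver*> receivers;
    void sendContent(const std::string& data) {
        for (Receiver* r : receivers) r->sendContent(data);
    }
};

// The broadcaster manages channels and applies the producer-side filter.
class Broadcaster {
public:
    void addRepeater(const std::string& channel, Repeater* rep) {
        channels_[channel].push_back(rep);
    }
    void setFilter(Filter f) { filter_ = f; }

    // Called by the information source with new data for one channel.
    void contentUpdate(const std::string& channel, const std::string& data) {
        if (filter_ && !filter_(data)) return;        // filtered out
        auto it = channels_.find(channel);
        if (it == channels_.end()) return;            // no such channel
        for (Repeater* rep : it->second) rep->sendContent(data);
    }

private:
    Filter filter_;
    std::map<std::string, std::vector<Repeater*> > channels_;
};
```

Note that neither the information source nor the receivers know about the repeaters' existence beyond registration, which mirrors the statement that the transport system is conceptually transparent.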














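[Figure: UML sequence diagram with the participants Information Source, Broadcaster, 
Repeater, and Receiver. The information source calls contentUpdate(channel, data) on 
the broadcaster; if the filters are passed, the broadcaster calls 
sendContent(channel, data, repeaters), and each repeater calls 
sendContent(channel, data, receivers).] 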

Fig. 4. Content distribution via a channel 



Having decomposed a push system into the components in Fig. 3, we will now take 
a detailed look at the concerns of each component of the component model. 



3.1 Channel 

A channel is a (logical) connector between a broadcaster and a receiver. It determines 
the protocols between these components. The most important of these protocols are the 
channel access protocol and the subscription protocol. Channels provide for many-to- 
many connections among broadcasters and receivers; each broadcaster provides a set 
of channels that receivers can subscribe to; each receiver subscribes to a set of chan- 
nels. Channels are a major distinction between event-based systems and push systems. 
The channel concept of push systems already provides a coarse level of information 
classification that event-based systems usually lack. By grouping data according to the 
information type the total amount of data transfers can be easily cut down. Data (events) 
need only be distributed to channel subscribers. Additionally, finer filtering can be 
applied to the contents of a channel (as in event-based systems). 

A channel determines several properties of the data to be disseminated and the 
supported functionalities: 

Type of information: The focus of the data that is distributed in a channel (e.g. financial 
news, software updates). 

Data format: The formats (e.g. HTML, Java) and semantics (e.g. static, executable) of 
a channel’s data. 

Personalizing/filtering: The extent of user customization that is possible (e.g. content 
selection, operation modes, payment). 

Content expiration: Channel content can be transient or persistent. An expiration strat- 
egy for the channel (possibly down to individual data pieces) must exist to prevent using 
up the consumer’s resources. 

Update strategy: This defines how updates of the channel’s contents are done: 
replacement, incremental, or differential updates. The timing of updates has an impact on data 
accuracy, network traffic, and scalability. 

Scheduling strategy: The main scheduling options are time-scheduled versus 
content-scheduled. Time-scheduled channels deliver “unrepeatable,” “live” content depending 
on the access time of a channel. Content-scheduled channels deliver content 
“time independently.” 

Operation mode: Consumers may not be online all the time. Support for offline/mobile 
operation with feasible synchronization protocols is necessary. 

Payment: Certain channels may involve payment (support channels, special contents,
etc.). The channel configuration determines which payment scheme to use: pay-per-view,
content-based, time-based, flat fee, etc.

Some of these properties can be modeled as (producer-side and consumer-side) data- 
stream (channel) filters. Filtering can be done both at the producer and at the consumer. 
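
As a minimal sketch of how the channel properties listed above could be captured in a data structure, the following Python fragment models a channel descriptor with producer-side and consumer-side filters. All names (`Channel`, `Item`, `by_topic`, `expires_after_s`) are hypothetical and chosen for illustration only; they are not taken from any of the systems discussed in this paper.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class UpdateStrategy(Enum):
    REPLACEMENT = "replacement"
    INCREMENTAL = "incremental"
    DIFFERENTIAL = "differential"


class Scheduling(Enum):
    TIME_SCHEDULED = "time-scheduled"
    CONTENT_SCHEDULED = "content-scheduled"


@dataclass(frozen=True)
class Item:
    """One piece of channel data (hypothetical representation)."""
    topic: str
    body: str


# A filter maps a list of items to a (usually smaller) list of items.
Filter = Callable[[List[Item]], List[Item]]


@dataclass
class Channel:
    name: str
    info_type: str                  # type of information, e.g. "financial news"
    data_formats: List[str]         # e.g. ["HTML"]
    update_strategy: UpdateStrategy
    scheduling: Scheduling
    expires_after_s: int            # content expiration (assumed unit: seconds)
    payment_scheme: str = "none"    # e.g. "pay-per-view", "flat fee"
    producer_filters: List[Filter] = field(default_factory=list)
    consumer_filters: List[Filter] = field(default_factory=list)

    def deliver(self, items: List[Item]) -> List[Item]:
        """Apply producer-side filters first, then consumer-side filters."""
        for f in self.producer_filters + self.consumer_filters:
            items = f(items)
        return items


def by_topic(*topics: str) -> Filter:
    """Build a filter that keeps only items on the given topics."""
    wanted = set(topics)
    return lambda items: [i for i in items if i.topic in wanted]
```

In this sketch a producer might restrict a channel to a few topics while each consumer narrows the selection further; only items that pass both filter stages are delivered.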

Channels model a 1:n relationship between a producer and its consumers. Additionally,
backchannels may facilitate “up-stream” communication as a 1:1 connector between
a consumer and a producer. A consumer can communicate information back to a
broadcaster or information source via a backchannel. This is usually done in a client-server
style and thus is conceptually “outside” of push systems. Backchannels can exist on a 
per-channel basis, for a set of related channels, or for the full set of channels available 
from one producer. 

Additional channel properties are given in [27]. 

3.2 Broadcaster 

A push system has at least one broadcaster component that offers channels and distributes 
channel data to the subscribers of the channels. For small-scale intranet applications one 
dedicated broadcaster may suffice. For large-scale applications that provide channels to 
thousands of subscribers it cannot be a single component. For scalability, a specialized 
broadcasting infrastructure is necessary here. 

The broadcaster itself may be distributed. A set of broadcasters may provide the 
channels and exchange updates among each other to stay in sync. The broadcaster may be 
organized according to a standard distributed data management scheme such as primary 
copy replication or data partitioning [3]. 

The primary goal is that receivers access channels from a broadcasting component 
that is “close” to them in some respect (bandwidth, delay, etc.) to minimize network 
traffic, reduce delays, and allow scalable systems. 

3.3 Receiver 

If we abstract from the transport medium, the broadcaster and receiver interact directly. 
The receiver has two main components: channel access and user interface. The receiver 
is the interface that facilitates interaction between human users and channels. It gets 
channel data from broadcasters and presents it to the user (the user could be human or an 
application). It allows the user to manipulate, control, and customize the user profile, the 
received information, and the channels. According to a channel’s defaults and the user’s 
settings the receiver is responsible for updating (received/requested) channel content, 
expiring channel data, and freeing disk space on demand. 

Channel discovery can be implemented in several ways. The receiver could query
a channel directory or a specialized directory channel. Besides standard channels,
specialized maintenance channels can exist that fulfill functions such as maintaining and
updating the push system’s software components themselves, e.g., the receiver. Thus
users would only have to do an initial setup and could use new software versions imme- 
diately; this possibility relates push systems to configuration management approaches 
like [8] and [9]. 

Push systems are related to mobile code systems since channels can distribute ex- 
ecutable code. In analogy to applets we introduce the notion of a pushlet: executable 
code and data which is intended for execution at the receiver. 

3.4 Transport System 

In a large-scale setting, a dedicated transport system is necessary to make a push system 
scalable and operational, i.e. decrease network bandwidth consumption and increase 
availability and responsiveness. 

A key design issue for the transport system is access transparency and scaling trans- 
parency towards the components and connectors described in the previous sections. 
Transparency, however, can only be achieved to a certain extent. The components of the 
transport system can be modeled by a so-called base distribution component (BDC). 
A BDC is a generic component that acts as a broadcaster towards receivers and as a 
receiver towards broadcasters. A BDC can exist in several configurations: 

Repeater. A repeater is preloaded with the channels’ contents and offers the same data 
as the broadcaster but is “closer” to the receiver. 

Cache. A cache is a repeater that is loaded dynamically rather than being
preloaded (an on-demand repeater).

Proxy. A proxy facilitates access to channels where broadcasters and receivers cannot 
communicate directly, e.g. receivers may be located behind a firewall. Every proxy has 
a domain translator sub-component that translates back and forth between the generic 
proxy functionality and the application domain functionality. 
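
The three BDC configurations can be illustrated with a small sketch. It assumes a hypothetical one-method interface, `fetch(channel)`, shared by broadcasters and BDCs, so that a BDC looks like a broadcaster to receivers and like a receiver to its upstream broadcaster. The class names and the `translate` hook for the domain translator are illustrative assumptions, and the sketch ignores content updates and invalidation.

```python
from typing import Callable, Dict, Iterable, Optional


class Broadcaster:
    """Origin of channel data; counts requests to show the offloading effect."""

    def __init__(self, channels: Dict[str, bytes]):
        self.channels = dict(channels)
        self.requests = 0

    def fetch(self, channel: str) -> Optional[bytes]:
        self.requests += 1
        return self.channels.get(channel)


class Repeater:
    """Preloaded with channel contents when it is set up."""

    def __init__(self, upstream, channels: Iterable[str]):
        self.store = {c: upstream.fetch(c) for c in channels}

    def fetch(self, channel: str) -> Optional[bytes]:
        return self.store.get(channel)


class Cache:
    """On-demand repeater: loads a channel on first access only."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.store: Dict[str, bytes] = {}

    def fetch(self, channel: str) -> Optional[bytes]:
        if channel not in self.store:
            data = self.upstream.fetch(channel)
            if data is None:
                return None
            self.store[channel] = data
        return self.store[channel]


class Proxy:
    """Relays requests, e.g. across a firewall; `translate` stands in for
    the domain translator sub-component (hypothetical hook)."""

    def __init__(self, upstream, translate: Callable[[bytes], bytes] = lambda d: d):
        self.upstream = upstream
        self.translate = translate

    def fetch(self, channel: str) -> Optional[bytes]:
        data = self.upstream.fetch(channel)
        return None if data is None else self.translate(data)
```

Because every configuration exposes the same `fetch` method, BDCs can be chained (for example a proxy in front of a cache in front of the broadcaster) without the receiver noticing, which is the access transparency the text refers to.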



3.5 The Notion of Broadcasting 

So far we have used the notion of broadcasting in the context of push systems only in- 
formally. Broadcasting in a large-scale push system cannot rely on a medium that offers 
broadcasting per-se (e.g. an Ethernet LAN). The scalability of a push system is in fact 
determined by the broadcasting strategy. The broadcasting strategy tries to balance the 
tradeoffs between reducing network load and reducing user response time: the broad- 
caster can reduce user response time if it pushes the data to the receiver node before 
the user accesses the data; but if the user does not access the data, network bandwidth 
has been lost. Several standard techniques may be used to implement a broadcasting 
strategy. 

Multicast. Push systems can exploit existing multicast infrastructures (e.g. MBone [12]) 
and protocols (e.g. RTP [24], NSTP [5]). This greatly simplifies the architecture and 
implementation of push systems and has several efficiency benefits. However, these re- 
sources are accessible by only a limited number of end users. 

Client pull. At regular, user-definable intervals the receiver checks with the broadcaster
whether the receiver’s view of the channel is still consistent or needs to be updated.
This pattern turns the concept of broadcasting upside-down: the initiative changes from
producer-side to consumer-side which is an apparent contradiction to the notion of push 
systems. However, most of the available push systems actually pull at the dissemina- 
tion infrastructure level. With the client pull scheme complete data accuracy cannot be 
achieved. High data accuracy and data freshness (“immediate” notification) can only be 
achieved at the cost of high pulling frequencies which induces high network traffic and 
a possibly high number of unnecessary messages if the channel data does not change 
frequently. On the other hand, consistency requirements of channels may be rather re- 
laxed and pulling frequencies between 10 minutes and a day may suffice. Additionally, 
messages that are pulled may be rather small (some 100 bytes). Nevertheless, pulling 
is frequently used in push systems since it is robust, simple to implement, allows for 
off-line operation, and scales well to high numbers of subscribers. 

Server push. The broadcaster actively sends content to its subscribed receivers. This 
solves the freshness problem of pulling but opens new problems. The main issue that 
appears in several ways is scalability: contacting the receivers sequentially does not 
scale even for medium numbers of subscribers. It would leave receivers with different 
views of channel information depending on their ordinal number in the pushing process. 
Server push broadcasting also requires a directory of subscribers to be contacted. That 
imposes additional administration since it must be maintained, kept consistent, and is a 
single point of failure. Additionally receivers may not be online all the time. This must 
be compensated by re-broadcasts which adds considerably to the broadcaster’s load and 
complexity. 

Hybrid approaches. Hybrid approaches combine the advantages of server push (fresh- 
ness, consistency) and client pull (scalability): consumers are notified about the availabil- 
ity of new data via a push mechanism (small messages) while the client pulls to transfer 
the actual data (possibly large data). This approach is taken in the Minstrel project [19]: 
the broadcaster pushes a “sample” (description of available data, a small-size sample of 
the real data, and administrative data) to the subscribers of a channel; based on this infor- 
mation the consumers may request the actual data as a “shipment” from the broadcaster. 
A similar approach is already used by several portal sites (like Netscape’s Netcenter): 
users sign up with a mailing list and receive mails in regular intervals (push part); these 
mails typically hold an HTML document that is displayed as in a browser when read with
an appropriate mail tool. The HTML document holds links that the user can click on and 
retrieve the corresponding document (pull part). 
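
The hybrid scheme can be sketched in a few lines. The following is an illustrative model only, not the Minstrel implementation: the broadcaster pushes a small “sample” describing new data, and each receiver decides whether to pull the full “shipment”. The class names, the `max_bytes` acceptance rule, and the in-process method calls standing in for network messages are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Sample:
    """Small notification pushed to subscribers (push part)."""
    item_id: str
    description: str
    size_bytes: int


class Broadcaster:
    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}
        self._subscribers: List["Receiver"] = []

    def subscribe(self, receiver: "Receiver") -> None:
        self._subscribers.append(receiver)

    def publish(self, item_id: str, description: str, data: bytes) -> None:
        """Store the data and push only a small sample to every subscriber."""
        self._items[item_id] = data
        sample = Sample(item_id, description, len(data))
        for receiver in self._subscribers:
            receiver.on_sample(self, sample)

    def ship(self, item_id: str) -> bytes:
        """Return the actual data when a receiver asks for it (pull part)."""
        return self._items[item_id]


class Receiver:
    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes          # hypothetical acceptance rule
        self.received: Dict[str, bytes] = {}

    def on_sample(self, broadcaster: Broadcaster, sample: Sample) -> None:
        # Pull the shipment only if the sample looks acceptable.
        if sample.size_bytes <= self.max_bytes:
            self.received[sample.item_id] = broadcaster.ship(sample.item_id)
```

The design choice illustrated here is that the expensive transfer happens only for receivers that want the data, while all subscribers still learn promptly that new data exists.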



4 Representative Push Systems 



Since 1996, a number of commercial systems have appeared that classify themselves as 
push systems. In this section we survey six prominent such systems. Table 2 compares 
the systems with respect to components and Table 3 classifies them in terms of the main 
features that were described in the previous sections. Providing such a comparison is 
surprisingly difficult due to the paucity of technical documentation on these systems. 
We were unable to find the answers to some of the entries in the tables. In the following,
we briefly examine each system.

Table 2. Comparison of push systems based on the component model

Push System  Channel  Broadcaster  Comm. Paradigm       Transport System
Castanet     ✓        ✓            pull                 repeater, cache, proxy
PointCast    ✓        CBF          pull & limited push  cache
BackWeb      ✓        ✓            pull & push          cache
Webcasting   ✓        -            pull                 -
WebCanal     ✓        ✓            push                 -
Intermind    ✓        -            pull                 -

Table 3. Comparison of push systems based on features

Push System  Backchannel  Pushlets      Update Strategy  Filtering  Scalability  Receiver Update  Data Sec.
Castanet     plugin       ✓             diff. (file)     -          high         ✓                high
PointCast    -            limited       ?                limited    low-medium   ✓                -
BackWeb      ✓            ✓             diff. (byte)     ✓          medium-high  -                high
Webcasting   external     ✓             diff. (file)     -          high         ✓                low
WebCanal     R = B        browser-like  diff. (file)     -          low-medium   -                -
Intermind    external     browser-like  ?                limited    medium-high  -                -


Castanet [16,17] is an advanced push system for distributing content with specific
emphasis on installing and updating software over the Internet. In Castanet the clients
pull at configurable intervals down to 15 minutes, or when requested by the user. It supports
HTML channels and pushlets (presentation, applet, or application channels written in
Java). Updates are differential, i.e. only updated files in a channel are sent to the receivers.
A limited backchannel functionality is provided by plugins: one plugin per channel
allows feedback data to be processed, e.g. to return language-specific data based on the
user’s configuration. No explicit means for filtering exists but a limited degree is possible via
user configurations. Castanet’s transport system provides for high scalability: repeaters
(transmitters; the broadcaster is called the primary transmitter), caches (called proxies), and
proxies (called gateways) that allow channel access behind firewalls. The information
source is modeled by the publisher software component. The Castanet receiver (tuner)
can be automatically updated. Castanet supports two security concepts: channel signing
guarantees the integrity and authenticity of channel data and SSL provides encrypted
transmission.
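
To make the idea of a file-granularity differential update concrete, the following sketch compares a per-file digest index sent by the receiver with the files held by the broadcaster and returns only what changed. This is an illustration of the general technique under our own assumptions (SHA-256 digests, an in-memory file map, the function names); it does not describe Castanet’s actual protocol.

```python
import hashlib
from typing import Dict, List, Tuple


def digest(data: bytes) -> str:
    """Content fingerprint of one file (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def index_of(files: Dict[str, bytes]) -> Dict[str, str]:
    """Receiver side: summarize the local channel copy as name -> digest."""
    return {name: digest(data) for name, data in files.items()}


def diff_update(
    server_files: Dict[str, bytes], client_index: Dict[str, str]
) -> Tuple[Dict[str, bytes], List[str]]:
    """Broadcaster side: return (files to send, names the receiver should delete)."""
    changed = {
        name: data
        for name, data in server_files.items()
        if client_index.get(name) != digest(data)
    }
    removed = sorted(name for name in client_index if name not in server_files)
    return changed, removed
```

Only new or modified files travel over the network; unchanged files cost one digest comparison each.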

PointCast [21,22] is both a push system and an information provider. Only content
coming from registered information providers can be broadcast via the Business Network.
The Central Broadcast Facility (CBF) is the central repository for PointCast network
information. Additionally, a freely configurable intranet channel for company information
systems and connections (content from Web servers) are supported. Channel data
consists of Web data formats and animations written in the ScreenPlay language, which
can be considered a limited version of pushlets. CDF [6] can be used to define (parts
of) channels. The default pulling interval of clients is one hour (configurable by the
user). Push distribution is available for intranets only (through multicast). PointCast has
no broadcaster. The administrative and channel data are retrieved from Web servers.
Several publishing tools exist. We have not found any explicit information on the
update strategy. PointCast has no backchannel concept. Limited filtering is possible: users
can select predefined content classes and types within channels. Only a limited set of
channels can be subscribed to in parallel. A cache (cache manager) is the only transport
system infrastructure: an organization can have a primary caching manager (CM) and
a set of second-level CMs. If multicasting is available the CM can act as a broadcaster
and actively distribute notifications. The scalability of PointCast is mainly limited by the
number of available CBFs. Currently only a few CBFs exist. CBFs have limited support
for load balancing (PointRouter and PointServer). Receiver software can be updated
automatically. Distributed data is neither authenticated nor secured.

BackWeb [1,2] is a highly configurable framework for information distribution. It 
comes with a rich set of supporting applications and authoring tools, including a special- 
ized authoring language — BackWeb Authoring Language Interface (BALI). It supports 
pull (HTTP and BackWeb Transfer Protocol) and push (based on StarBurst’s Multicast 
File Transfer Protocol) distribution. Pull-based channels are queried every 5 minutes 
for updates by default (configurable). Pushlets can be executable files, Java applets and 
Netscape plugins. Differential updates are supported at a byte granularity and are trans- 
parent against network disconnects. Backchannels are available by the concept of up- 
stream data which supports the building of two-way push applications. Users can filter 
channel data by type and content. Scalability ranges from medium to high depending on 
the distribution protocols and transport infrastructure used. BackWeb’s transport system 
uses chained caches (proxy servers). Receiver software has to be maintained manually.
Certificates, digital signatures, and encryption are supported to ensure authenticated 
information and secure transmission. 

Webcasting [18] is Microsoft’s push technology. The receiver for channels is Internet
Explorer (IE). Three types of Webcasting exist: basic (“crawling” a Web site), managed 
(a CDF [6] description defines the downloadable data: a list of URLs, its hierarchical
structure, and an update schedule), and “true” (integration of third-party software
for multicast or other paradigms). Without third-party products the main paradigm of 
Webcasting is pull at user-configurable intervals. Since IE is used as the receiver all 
supported Web formats can be used. Thus pushlets can be any executable content that 
IE can deal with (Java, JavaScript, ActiveX, Windows applications, etc.). The update 
strategy is differential at the granularity of files. The user can choose between monitor- 
ing content changes or downloading content changes and is notified of changes via IE 
or via email. No explicit backchannels exist, but they can be implemented “outside” by 
means of Java, ActiveX, DynamicHTML, etc. Users can choose the (parts of) channels 
they want to receive. Further filtering and personalizing can be made available by the 
channel provider. Webcasting does not have a dedicated broadcaster or a transport sys- 
tem since it fully relies on the Web infrastructure (Web servers, caches, etc.). Thus it is 
scalable to the degree the Web itself is. The receiver can be automatically updated via a 
special software update channel. Software update channels rely on OSD [26], a
vocabulary for describing software components and their dependencies for deployment that
has been submitted to the W3C for standardization. Software packages sent via a software
update channel can be authenticated. Other data is neither authenticated nor secured.

WebCanal [14,15] is a platform for global information broadcast on the Internet. It 
uses multicast push distribution based on the light-weight reliable multicast protocol [13] 
(an extension of RTP [24]) and the MBone [12]. It consists of a WebCaster, a WebTuner, 
and several other tools. WebCanal can also be used as a conferencing and presentation 
platform. Since it interacts with a web browser for content display, the above statements
for Webcasting concerning pushlets, filtering, and backchannels apply here too. For
backchannels they must be adjusted, since WebCanal can also be used for symmetric
two-way communication: every receiver can act as a broadcaster. Updates are differential at
file granularity. WebCanal relies on the MBone as its transport infrastructure, so
its scalability depends on that of the MBone. Receiver software cannot be updated automatically
though WebCanal supports software distribution channels. Distributed data is neither 
authenticated nor secured. 

Intermind [27] is a pull-based push system. It has no broadcaster. Channels
(administrative and content data) are available via Web servers. Receivers (Intermind
communicator) regularly check whether new content is available for a channel. A Web
browser is used for displaying channel content. Thus pushlets are supported based on 
the executable content supported by the Web browser. Channels can be defined with 
the channel publishing tool. Such descriptions and the data are placed on a Web server 
where receivers can access it. We have not found any explicit information on the up- 
date strategy. It is also unclear how backchannels are supported. Backchannels could be 
implemented using features (Java, JavaScript, etc.) offered by the Web browser. While 
[27] defines a concept for data and meta-data exchange — channel objects based on XML 
and the Resource Description Framework (RDF) — between communication nodes, it is 
unclear to what extent this is implemented in Intermind. Limited filtering is available: 
inside a channel the user can select topics to receive from predefined per-channel top- 
ics. Additionally, channels can be categorized according to user-defined categories. Our 
comments for Webcasting regarding the transport system and the scalability apply for 
Intermind too. Receiver software cannot be updated automatically. Distributed data is 
neither authenticated nor secured. Intermind owns a patent on push-like communication
[10].

5 Related Paradigms 

The diffusion of the Internet has given rise to a number of novel distributed program- 
ming paradigms. Among these, push systems, mobile code, and event-based systems 
are closely related. This section discusses the relationships between these three systems, 
their commonalities and distinguishing properties. Before elaborating on this topic we 
will first contrast electronic mail and Usenet news with push systems and event-based 
systems. A combination of electronic mail and Usenet news can be used to emulate some 
of the functionalities but fails to meet the requirements of more advanced systems. 

Mail has a 1:1 relationship model between producers and consumers. Even if mailing
lists are used, one mail to n receivers is duplicated and transmitted n times. It is strictly
decoupled and has very limited interaction functionalities. Usenet news has an n:m
relationship model. The biggest problem of Usenet news is that it is no longer scalable
without substantial redesigns [7]. Both electronic mail and Usenet news lack integrated 
concepts for authentication, secure transmission, mobile code, administration, etc. This 
list of deficiencies is not comprehensive but motivates the need for new concepts beyond 
electronic mail and Usenet news. 

5.1 Mobile Code 

Program code used to be bound to a certain processor / computer. The intention of 
the mobile code paradigm is to have code travel around networks and computers. So- 
called mobile agents are an interesting approach for addressing information discovery, 
brokering, and scalability problems of information systems. 

Channel content in a push system can consist of executable code (pushlets) that is to 
be executed at the receiver. Thus push systems must address similar issues as pure mobile 
code systems although at a simpler scale since some of the problems in mobile code 
systems do not arise for push systems (routing of agents, protection against tampering 
of agent data, etc.). The main intersection of issues is in protecting host systems against 
malicious code and controlling access to host system resources. 

A push system can be seen as a mobile code system and vice versa; a push system 
that distributes pushlets is a special case of a mobile code system. If a push system 
distributes pushlets and every receiver in a push system is also a broadcaster to route and 
forward pushlets, then this is similar to a mobile agent system. A mobile code system, 
on the other hand, can be used to actively transport information to users and thus can 
serve as a push system. The essential difference between the two systems is in intent of 
use: push systems are data-centric, focusing on efficient dissemination of information, 
whereas mobile code systems are functionality-centric, dealing with the distribution of 
computation to reach a defined goal. 

5.2 Event-Based Systems 

Event-based systems define a new style for the construction of (distributed) applications 
based on the notion of events. In such systems, components interact by generating and 
receiving events. Components declare interest in receiving specific events and are notified 
on occurrence of those events. This pattern supports a highly flexible interaction between 
loosely-coupled components [4]. The architectural model is well-developed for local 
area networks. In a large-scale, heterogeneous setting like the Internet, however, new 
and adapted technologies are needed since many of the premises of a LAN do not 
hold at the Internet-scale. A design framework for Internet-scale event-based systems is 
presented in [23] that suggests a seven-dimensional design space. Some classifications 
of event-based approaches are given in [4]. 

Push systems and event-based systems are closely related. In fact, it is not always 
clear where to draw the dividing line. Distinctive properties exist, however. Table 4 lists 
the main differences between the two paradigms. 

Table 4. Push systems vs. event-based systems

                                   Push Systems                           Event-based Systems
Purpose                            timely data distribution               event notification
Participant roles                  asymmetric                             symmetric
Advertisement policy               simple advertisement (channel)         expressive advertisement language
Subscription policy                simple subscription (channel)          expressive subscription language
Frequency of events                low to medium                          high
Number of events                   low to medium                          high
Payload size                       large                                  small
Producer/consumer interconnection  static channels and static producers   dynamic binding to producers
Event grouping                     channel                                event patterns
Filtering                          reduce data transmission requirements  reduce number of events

The purpose of push systems is the timely distribution of data to consumers whereas
event-based systems focus on notification of events. The roles of participants differ
considerably: push systems have two distinct groups, event producers (broadcasters) and
event consumers (receivers), while in event-based systems everyone can produce and
consume events. The announcement and subscription of new information is simplified in 
push systems since they can rely on the channel concept that provides a tighter coupling 
between producers and consumers while still providing some flexibility. Event-based 
systems, on the other hand, only have a very loose coupling between producers and 
consumers and therefore must have powerful mechanisms for event selection. 

The number and frequency of events in push systems will be limited by content 
transmission rates and thus be at a moderate level. Event-based systems in contrast are 
targeted at possibly very high event-rates. Closely connected with this are the payload 
sizes: while the size of the payloads transmitted in push systems can be quite large (since 
they are information-oriented), an important design criterion in many event-based sys- 
tems is to minimize the size of events. Due to the channel concept, the interconnection 
of producers and consumers in push systems is rather static: consumers are likely to 
receive a fixed set of channels from a set of producers with little change. Though
consumers are notified about new channels, for example via a meta-channel, subscription and
unsubscription will occur infrequently once a satisfying profile of interest exists. 

Channels also provide an implicit mechanism for event grouping: a channel will 
offer “events” of a certain quality only (e.g., weather forecast channel). Loose coupling
in event-based systems will lead to more dynamic interconnections: event producers can 
be mobile and change frequently. Event-based systems are intended to have sophisticated 
event grouping mechanisms called event patterns. Consumers are able to group events 
by patterns, e.g. XY*Z, meaning event X followed by zero or any number of events Y 
followed by event Z, and receive a single notification on occurrence of a pattern. Though 
the usefulness of this mechanism is undoubtedly high it adds considerable complexity to 
the implementation of such services. Ordering of distributed events is a highly complex 
research area that still needs further investigation. Patterns operate on the level of events, 
while filters operate on content-specific information to select events for notification. In
push systems, filters help to reduce data transmissions while in event-based systems the
goal is to cut down on the number of events. 
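
The event pattern XY*Z mentioned above can be recognized with a very small state machine. The sketch below assumes a single, totally ordered stream of event-type names, which sidesteps the ordering problem for distributed events noted in the text; the function name and the representation of events as strings are illustrative assumptions.

```python
from typing import Iterable, List, Tuple


def match_x_ystar_z(events: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (start, end) positions of each contiguous occurrence of X Y* Z.

    A consumer subscribed to this pattern would receive one notification
    per returned pair instead of one per raw event.
    """
    matches: List[Tuple[int, int]] = []
    start = None  # position of the X that opened the current candidate
    for i, event in enumerate(events):
        if event == "X":
            start = i                      # (re)start at the most recent X
        elif event == "Y":
            pass                           # Y keeps an open candidate alive
        elif event == "Z":
            if start is not None:
                matches.append((start, i))
            start = None                   # Z closes the candidate
        else:
            start = None                   # any other event breaks the pattern
    return matches
```

For the stream A X Y Y Z X Z Y Z X A Z this yields two notifications, for positions (1, 4) and (5, 6); the final X is discarded because an unrelated event A intervenes before Z.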

Our comparison shows that, despite their similarities, event-based and push systems
differ in focus, requirements, and applied concepts, and are distinctly different
models of distributed information systems.

6 Requirements for Widespread Use 

We have presented the push paradigm as a programming or architectural model for 
distributed systems and applications. Such a treatment identifies a number of interesting 
and important issues for further investigation, among these are: system-level design 
and integration issues, business-oriented issues, and multidisciplinary issues that reach 
beyond computer science and software engineering. In this section we present the main 
issues that we believe must be addressed in order for push systems to become usable on 
a wide scale. We also believe that these issues are relevant to any architectural model to 
be used on an Internet scale. 

The key underlying design goal for any distributed system is scalability. The compo- 
nent model of Sect. 3 supports scalability by clearly separating producers from receivers 
by an intermediate transport system. We have shown a standard structure for the trans- 
port system based on caching and replication. As we have seen in Sect. 4, the model 
can be used to accurately describe the structure of existing push systems and analyze 
and compare the important design decisions in those systems. But analyzing the scala- 
bility of a push system is far from straightforward because it depends on many criteria 
and many design goals: number of broadcasters, receivers, channels; amount of data 
on channels; frequency of updates; network latency and bandwidth; and the amount of 
common subscriptions to certain channels. The component model of Sect. 3 can be used 
as a basis for developing a scalability model and reference implementations for push 
systems. We are developing such a reference implementation in the Minstrel project at 
the Technical University of Vienna. We are using the component model as an architecture
for developing plug-compatible components for push systems. Preliminary analysis
of the benefits is promising. For example, the worst case for sending a 3.5kB message
to 10000 receivers over a typical mixed-bandwidth network is around 45 seconds while 
with standard non-optimized email this would take over 1 hour. The average delay would 
be around 13 seconds. These figures only take into account bandwidth delays since the 
processing load that contributes to the delay can only be estimated (and for Minstrel 
would be distributed among the transport infrastructure). However, this provides a good 
indication towards performance figures because currently bandwidth is the most limit- 
ing resource. Another key performance issue at the design level involves the choice of 
locations for repeaters. Sometimes one has no control over this choice but in many cases 
there is control (e.g. on private networks). 

More interestingly, on the Internet, using Internet service provider sites as repeaters 
seems to be a promising choice. But this issue, as in many other Internet-related systems, 
raises the question of payment for services. Indeed, payment methods and business 
models have to be addressed by any commercial Internet system. This implies that push 
systems must be able to integrate supporting payment models. Because of the existence
of the subscription phase, standard solutions such as macro-payments or flat fee systems
(e.g. monthly charge to credit card) may be used. But just as push systems completely 
reverse the pull model, they also change the traditional payment assumptions. The sender 
may be interested in charging for all the data it sends out, especially since the receiver has 
subscribed to the information explicitly, but the receiver is only interested in paying for 
what is actually read (micro-payments, pay-per-view). In Minstrel, we are developing 
standard components for payment schemes. Figure 5 gives a model for a pay-per-view 
interaction in Minstrel. 




Fig. 5. Pay-per-view in Minstrel 



Say the push vendor has offered some information the user is willing to pay for (1-2). 
Then the following steps are taken: the user issues a request for the offered information 
which includes a payment handle, for example the unique id of the offer (3). This handle 
is given to the user’s wallet which is instructed to pay (4-5). If the payment succeeds, 
the payment server sends a receipt to the wallet which in turn notifies the component 
that processes the user’s request (6a, 7). Concurrently, the push vendor is notified and 
registers the receipt (6b). Now the original user request together with the receipt is sent 
to the push vendor which checks the receipt and returns the requested data (8-9). Then 
the received data is presented to the user (10). This payment model, which is built
around the notion of a receipt, can also be used to implement other payment
schemes, including time-based or flat fee schemes.
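
The receipt-centred interaction above can be illustrated with a minimal sketch. It is not Minstrel code: the class names, the in-memory "payment server", and the use of a random nonce to make receipts unique are all assumptions made for illustration, and the numbered comments refer to the steps in the text.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    offer_id: str   # the payment handle of step 3
    nonce: str      # hypothetical: makes every receipt unique


class Wallet:
    def __init__(self, balance):
        self.balance = balance


class PushVendor:
    def __init__(self, offers):
        self.offers = offers      # offer_id -> (price, data)
        self.receipts = set()     # receipts registered in step 6b

    def register(self, receipt):
        self.receipts.add(receipt)

    def deliver(self, offer_id, receipt):
        # Steps 8-9: check the receipt, then return the requested data.
        if receipt not in self.receipts or receipt.offer_id != offer_id:
            raise PermissionError("invalid receipt")
        return self.offers[offer_id][1]


class PaymentServer:
    def __init__(self, vendor):
        self.vendor = vendor

    def pay(self, wallet, offer_id):
        # Steps 4-6: debit the wallet, notify the vendor, hand back a receipt.
        price = self.vendor.offers[offer_id][0]
        if wallet.balance < price:
            return None
        wallet.balance -= price
        receipt = Receipt(offer_id, secrets.token_hex(8))
        self.vendor.register(receipt)   # 6b
        return receipt                  # 6a


def pay_per_view(wallet, server, vendor, offer_id):
    receipt = server.pay(wallet, offer_id)      # steps 3-7
    if receipt is None:
        return None                             # payment failed, no data
    return vendor.deliver(offer_id, receipt)    # steps 8-10
```

Because the vendor only ever checks receipts, a time-based or flat fee scheme would, under the same assumptions, differ only in when and how the payment server issues them.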

Another business-related issue is security and authentication. If high-quality infor- 
mation providers want to charge users that receive data via a push system, users must be 
sure that the information they get is authentic, i.e. fresh, unmodified information from 
an identifiable source. This issue is important in typical push application domains like 
news agencies, financial information services, and other high-confidence businesses.
Technically this requires the availability of authentication frameworks and certificate
authorities on a large scale (X.500, LDAP). For confidentiality of the data itself,
encryption methods must be supported by the push systems. Full integration into push 
systems and “chain of trust” infrastructures — for example, you have to trust all sites 
running repeaters — still await Internet-scale deployment. Pushlets raise another security 
problem: how can executable content received from network sources be executed in a 
safe yet “useful” way, i.e. what accesses to local resources are allowed. As event-based 
systems, push systems can also facilitate software release management [25], and soft- 
ware deployment and configuration management [8,9]. Deployment and maintenance 
of software raises similar security issues as it adds another magnitude of difficulty to the 
security problems that must be considered by a push system. These problems are similar 
to the ones that must be addressed by mobile code systems [11]. 

For widespread use of push on the Internet, standard protocols will be necessary. 
In particular, protocols and interfaces for channel definition, subscription, and access 
will be needed. At the moment, the available push systems are incompatible and cannot 
interact. Thus users and information providers have to install dedicated software for each 
system, and information needs to be tailored and structured explicitly for every system 
supported. A unified framework/standard as exists for the Web is necessary to make 
push systems a successful technology. 

7 Summary and Conclusion 

Even though there are many documents on the worldwide web and in electronic maga- 
zines about push systems, these are mostly at the user and application level, with little 
systematic treatment of the design and research issues. This paper has presented push 
systems as an architectural model for distributed systems and interactions and has po- 
sitioned it with respect to client-server and event-based architectures. The subscription 
phase of the interaction model is the key to the scalability of the push model and is 
applicable to many distributed applications for which client-server computing is defi- 
cient. We have presented a component model for push systems that may be used to 
study, analyze, and contrast different implementations of push systems, and we have 
done that for six prominent push systems. Using the concepts of information source, re- 
ceiver, broadcaster, and transport system, our component model separates the issues of 
content management, channel management, scalability, and user-interface management 
into different components. Our component model may be used as a basis for a reference 
implementation of push systems. 

We have contrasted push systems with the closely related paradigm of event-based 
systems, pointed out the distinguishing features, and shown the connection with mo- 
bile code systems. We have also presented the main issues that need to be tackled by 
push systems: scalability, network traffic, security, authentication, and electronic com- 
merce. We are currently addressing all these issues in our Minstrel project [19]. The 
Minstrel project uses the component model of Sect. 3 as an architecture for developing 
plug-compatible components for push systems and to devise an open protocol suite for 
Internet-scale content distribution. Minstrel is a proof-of-concept implementation of our 
architectural model and serves as an extensible software platform for further research.
The main design issues of Minstrel are scalability, a hybrid broadcasting paradigm
that supports timely notification while requiring no special multicast infrastructure, a 
distributed model for simplifying information authentication, integrated support for mi- 
cropayment e-commerce (e.g., Millicent), and support for pushlets that are executed in 
a highly configurable client security framework (subtractive security policies, security 
negotiation). Minstrel is being implemented in Java and the protocols are based on RMI.



Abstract. In this paper we take the extreme view that every line of
code is potentially mobile, i.e., may be duplicated and/or moved from 
one program context to another on the same host or across the network. 
Our motivation is to gain a better understanding of the range of con- 
structs and issues facing the designer of a mobile code system, in a setting 
that is abstract and unconstrained by compilation and performance con- 
siderations traditionally associated with programming language design. 
Incidental to our study is an evaluation of the expressive power of Mobile 
Unity, a notation and proof logic for mobile computing. 



1 Introduction 

The advent of world-wide networks, the emergence of wireless communication, 
and the growing popularity of the Java language are contributing to a growing 
interest in dynamic and reconfigurable systems. Code mobility is viewed by many 
as a key element of a class of novel design strategies which no longer assume that 
all the resources needed to accomplish a task are known in advance and available 
at the start of the program execution. Know-how and resources are searched for 
across the networks and brought together to bear on a problem as needed. Often 
the program itself (or portions thereof) travels across the network in search of 
resources. While research has been done in the past on operating systems that 
provide support for process migration, mobile code languages offer a variety 
of constructs supporting the movement of code across networks. Java [5] and 
Tcl [4] derivatives support the movement of architecture-independent code that 
can be shipped across the network and interpreted at execution time. Obliq [2] 
permits the movement of code along with the reference to resources it needs to 
carry out its functions. Telescript [11] is representative of a class of languages in 
which fully encapsulated program units called agents migrate from site to site. 
Location, movement, unit of mobility, and resource access are concepts present 
in all mobile code languages. Differentiating factors have to do with the precise 
definitions assigned to these concepts and the constructs available.

Language design efforts are complemented by the development of formal
models. Their main purpose is to gain a better understanding of fundamental is- 
sues facing mobile computations. Of course, such models are expected to play an 
important role in the formulation of precise semantics for mobile code languages 
and constructs, to serve as a source of inspiration for novel language constructs, 
and to uncover likely theoretical limitations. Basic differences in mathematical 
foundation, underlying philosophy, and technical objectives led to models very 
diverse in flavor. The π-calculus [8] is based on algebra and treats mobility as
the ability to dynamically change structure through the passing of names of 
entities including communication channels. Several extensions have been pro- 
posed, many of which provide an explicit notion of location [1,9]. In particular, 
the ambient calculus [3] emphasizes the manipulation of and access to admin- 
istrative domains captured by a notion of scoping. Mobile Unity [6] is a state 
transition system in which the notion of location is made explicit and component 
interactions are defined by coordination constructs external to the components’ 
code. 

The work reported in this paper is closely aligned with the investigative style 
of the formal models community but directed towards identifying opportunities 
for novel mobility constructs to be used in language design. We are particularly 
interested in examining the issue of granularity of movement and in studying 
the consequences of adopting a fine-grained perspective. Simply put, we asked
ourselves the question: What is the smallest unit of mobility and to what extent 
can the constructs commonly encountered in mobile code languages be built 
from a given set of fine-grained elements? Proper choice of mobility operations, 
elegant and uniform semantic specification, formal verification capabilities, and 
expressive power are several issues closely tied into the answer to the basic 
question we posed. 

In the model we explore here the units of mobility are single statements and 
variable declarations. Location is defined to be a site address and units can move 
among sites, can be created dynamically, and can be cloned. Complex structures 
can be constructed by associating multiple units with a process. The process is 
the unit of execution in our model. In the simplest terms, a process is merely a 
common name that binds the units together and controls their execution status. 
All the mobility operations available for units are also applicable to processes. In
addition, processes have the means to share code and resources via a referencing 
mechanism limited strictly to the confines of a single site. A reference can be 
thought of as a name that allows one process to access some code or data in 
some other process. References across sites are not permitted but they survive 
movement, e.g., access is restored when the two processes meet again. As such, 
unit reference and unit containment have distinct semantics with respect to both 
scoping rules and mobility. 

Mobile Unity provides the notational and formal foundations for this study. 
The new model can be viewed to a large extent as a specialization of Mobile 
Unity. This enables us to continue to employ the coordination constructs of 
Mobile Unity and its proof logic. The result is a small set of macro definitions
that map the fine-grained model proposed here to the standard Mobile Unity
notation, and a specification of the semantics of mobility constructs in terms of 
the coordination language that is at the core of Mobile Unity. 

This application of Mobile Unity is novel. Mobile Unity has been used pre- 
viously in the definition of high level transient interactions (e.g., transiently and 
transitively shared variables) in both a physical and logical mobile setting [6], 
in the formal specification and verification of Mobile IP [7], and in the specifi- 
cation and verification of mobile code paradigms (e.g., code on demand, remote 
evaluation, and mobile agents) [10]. 

The paper is structured as follows. Section 2 contains an informal overview
of the model, and Section 3 introduces its overall structure. Section 4
describes the mobility primitives of our model, and Section 5 defines their
formal semantics. Finally, in Section 6 we draw some conclusions.



2 Model Overview 

We now give an informal overview of our model. We consider a network composed 
of sites. They are the physical locations on which computations take place. Sites 
may represent physical hosts or separate logical address spaces within a host, 
e.g., an interpreter. Sites may contain units that represent code or data. A code
unit need not contain a complete specification of a code fragment; it may even
be a single line of code. The variables used in the code units are considered
“placeholders” and they do not carry a value (i.e., their value is undefined). 
Units representing data contain a single variable declaration and they carry the 
actual value of the variable. The model provides a sharing mechanism between 
values of variables with the same name in code and data units, thus code can 
change values of variables in data units during execution. 

Because code and data can be split across units, we need to include some 
notion of composition and scoping. For this purpose we introduce the concept of 
process. Processes are unit containers that reside on the sites. Unlike units they 
carry an activation status — they can be active, inactive, or terminated. Processes 
define restricted scopes for the units on the sites. Units can be placed inside a 
process, i.e., in its “private space”. Such units are said to be contained by the 
process¹. The scope of a unit contained by a process is the private space of that
process, i.e., the space on which the unit is located. The binding mechanisms
defined by the model allow sharing among variables with the same name in the
same scope. The scope of a unit that is not contained in any process (i.e., located
directly on the site space) is restricted to the unit itself. Figure 1.a shows
an example. The scope of unit v also contains unit w, and vice versa, as they are
both contained in process P, while unit u is not contained in any process and
its content is not shared with anyone else.



¹ The model presented in this paper is kept simple by not allowing processes to contain
other processes. We are currently investigating this enhancement.



Fig. 1. Processes, units, and scoping rules. Solid lines represent the containment 
relation among sites, processes, and units, while dotted lines represent references 
to units. Dashed rectangles represent a common scope for units. 



Because it is often necessary to have sharing of units among processes at the
same location (e.g., to specify the sharing of a common resource), we allow a
process to reference a unit contained in another process at the same location.
In such a case, the referenced unit is considered to be in the scope of both
processes. Processes can also reference units not contained in any process (i.e.,
located directly in the site). These units can be thought of as library classes or
resources provided by the site to all processes located there. Figure 1.b shows
an evolution of the system from Figure 1.a: here unit u is referenced by process
P, and units u, v, and w are in the same scope. Unit w is referenced by process
Q: since units x, y, and w are in the same scope, sharing applies. Notice that
units x and y are not in the scope of unit v.

A process is a unit of execution in the sense that its status constrains the 
execution of the code belonging to units inside its scope. A process has an
activation status that can be manipulated by specific operations. The code units
inside the scope of the process can only be executed when the process is active.
Processes constrain the mobility of units as well: the movement of a process
implies the movement of all the units contained in it. Referenced units, however, are
not moved along with the process that refers to them as they are not part of its 
private space. Furthermore, the binding mechanism inhibits the access to refer- 
enced units whenever the referencing process and the referenced unit are not on 
the same site. It is important to notice, however, that references to units are not 
discarded at the time of the move; when a referenced unit and the corresponding 
process become colocated on any site the binding is re-established. 
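
The scoping and binding rules described so far can be made concrete with a small sketch. It is only an illustration of the informal rules, not part of the model's formal semantics: the class names, the method names, and the site names "A" and "B" are assumptions. The configuration built at the end mirrors Figure 1.b.

```python
class Unit:
    def __init__(self, name, site):
        self.name = name
        self.site = site


class Process:
    def __init__(self, name, site):
        self.name = name
        self.site = site
        self.contained = []    # units in the private space of the process
        self.referenced = []   # units the process merely refers to

    def contain(self, unit):
        unit.site = self.site
        self.contained.append(unit)

    def reference(self, unit):
        self.referenced.append(unit)

    def scope(self):
        # Contained units are always in scope; referenced units only while
        # they are on the same site as the referencing process.
        names = {u.name for u in self.contained}
        names |= {u.name for u in self.referenced if u.site == self.site}
        return names

    def move(self, site):
        # Contained units travel with the process; references are kept but
        # remain inactive until process and unit are co-located again.
        self.site = site
        for unit in self.contained:
            unit.site = site


# The configuration of Figure 1.b, on a hypothetical site "A".
u, v, w = Unit("u", "A"), Unit("v", "A"), Unit("w", "A")
x, y = Unit("x", "A"), Unit("y", "A")
p, q = Process("P", "A"), Process("Q", "A")
p.contain(v); p.contain(w); p.reference(u)
q.contain(x); q.contain(y); q.reference(w)
```

Calling p.move("B") takes v and w along, leaves u behind, and removes w from the scope of Q; moving P back to "A" re-establishes both bindings without any further bookkeeping, because the references were never discarded.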

The model also provides mechanisms to generate and duplicate components, 
to explicitly terminate processes, and to establish or sever a reference between 
a process and a unit. In the next section we present the structure of the model 
in some detail.



    System Swapping
      Program Q(i) at λ
        declare
          x : integer [] y : integer
        initially
          x < 10 [] y < 10
        assign
          s : x, y := y, x  if x > y
        [] m1 : λ := λ + 1
        [] m2 : λ := λ − 1
      end
      Components
        ⟨[] i : 0 ≤ i < N :: Q(i).λ = location(i)⟩
      Interactions
        Q(i).x ≈ Q(j).x
          when Q(i).λ = Q(j).λ
          engage max(Q(i).x, Q(j).x)
    end


Fig. 2. A simple Mobile Unity system exhibiting random movement. 



3 Overall Model Structure 

In this section we introduce our model for fine-grained mobility and examine its
relation to the Mobile Unity notation. A Mobile Unity specification is composed
of several programs, a Components section, and an Interactions section.
The program is the basic unit of definition and mobility in Mobile Unity.
Figure 2 shows a Mobile Unity system for reordering values of variables.
Distribution of components is taken into account through the distinguished location
variable λ associated with each program.

The declare section contains the declaration of program variables. The symbol
[] acts as a separator. The initially section constrains the initial values of
the variables. In the example of Figure 2, x and y are initialized to an arbitrary
value less than 10. The assign section contains the program statements. In the
example, statement s is an assignment guarded by the clause following the if.
The values of the two variables are swapped if the first is greater than the
second. The statements m1 and m2 account for mobility, by modifying
non-deterministically the location of Q.

The Components section defines the components existing during the life of
the system. Mobile Unity does not allow dynamic creation of new components.
Each Mobile Unity program contains an index (i.e., i in the example) after the
name of the program (i.e., Q). This allows for the creation of multiple instances
of the same program in the Components section. In Figure 2, for instance, N
different instances of program Q are instantiated and placed at various initial
locations by using a function location (whose details are left out), and the index
value².

² The three-part notation ⟨op quantified_variables : range :: expression⟩ is used
throughout the paper: the variables from quantified_variables take on all possible
values permitted by range. If range is missing, the first colon is omitted and the
domain of the variables is restricted by context. Each such instantiation of the variables
is substituted in expression producing a multiset of values to which op is applied.

    Program p(x, Q, i)              Program p(s, Q, i)
      declare x : integer             declare x : integer [] y : integer
      initially x < 10                initially x = ⊥ [] y = ⊥
      assign skip                     assign x, y := y, x  if x > y
    end                             end

Fig. 3. Two units resulting from a reinterpretation of the program shown in 
Figure 2. 



The Interactions section contains statements that provide communication 
and coordination among components. In the example, the Interactions section
allows the sharing of values between the two variables x of different programs
when the programs containing them are at the same location. Only some of
the program instances end up sharing the values of variables x, depending upon
their initial location and their subsequent moves. The Mobile Unity construct ≈
defines transient sharing of values for as long as the when condition holds. The
engage statement defines a common value to be assigned (atomically) to both 
variables as the when condition transitions from false to true. In this example 
the value assumed by the two variables is the maximum over their individual 
values. It is possible to specify also a disengage statement that defines the 
values the two variables would respectively be assigned to whenever the when 
predicate is no longer true. If no disengage is specified, the variables retain the 
values they had before the when condition became false, as in our example. 

A Mobile Unity computation consists of a fair interleaving of statement 
executions, including the statements present in the Interactions section. The 
sharing construct has a higher priority and is executed any time a change in the 
values of the variables involved in the sharing happens. 

Mobile Unity considers a program to be the smallest unit of mobility. In this
paper we want to allow mobility of a variable declaration or of a line of code. For
this purpose, we set out to reinterpret the syntax of a standard Mobile Unity
program such that every variable declaration and every labeled statement is
interpreted as a stand-alone program, henceforth called a unit. A program now
becomes only a static unit of definition. Statements and declarations as well as
processes become the units of mobility. With this interpretation, the declaration
of x in Figure 2 corresponds to the unit p(x, Q, i) in Figure 3. The name of
all the units is now the constant p. Each unit is indexed by its name, the name
of the program in which it is defined, and by its instance discriminator. This
representation is designed to facilitate the search for units present at some
location using the name and/or place of definition. We use a quote to distinguish
the actual components from their names, in particular for the first two indices
which range over finite enumerations. This notation allows the same names to
be present in different program contexts. Notice that a unit capturing a
declaration also contains the corresponding initialization statement for the declared
variable. This is the definition of what we call a data unit. As will be shown
in the next section, the annotation var is used to distinguish between variables
present in pure data units and those appearing in code units. For code units
(i.e., units containing statements), the first index of the unit is the label of the
statement defined in the program. For instance, Figure 3 shows a code unit with
name p(s, Q, i) derived from the statement labeled with s in Figure 2. The
statement is copied in the assign section of the unit. All the variables used in
the statement are declared and initialized as unbound, i.e., ⊥. This initialization
underscores the fact that this unit contains only code, and that the variables are
mere placeholders, i.e., do not contain real values.

Finally, processes are needed to organize units into executable assemblies.
Each process has an index, like a unit, in order to allow multiple instances of the
same process. Processes can be instantiated and placed on an initial location from
within the Components section. Since processes are dynamic components we
attach to them a status variable ω that can assume the values active, inactive,
and terminated. In order to overcome the difficulty of dynamically creating
components in Mobile Unity we assume that a sufficiently large number of
instances of components is initially located in a sort of “ether”. We formalize this
by saying that they reside (implicitly) at the location λ = ε. In this manner,
whenever we need to duplicate or instantiate a new component we can simply
change the location of some component in the ether from ε to an actual location.
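
The "ether" device amounts to simulating creation with a pre-allocated pool. A minimal sketch of the idea follows; the names ETHER, Component, and instantiate are hypothetical, and the pool size of three is arbitrary.

```python
ETHER = None   # stand-in for the distinguished "ether" location


class Component:
    def __init__(self, index):
        self.index = index
        self.location = ETHER   # every instance starts out in the ether


def instantiate(pool, site):
    """Simulate creation by relocating an unused instance out of the ether."""
    for component in pool:
        if component.location is ETHER:
            component.location = site
            return component
    raise RuntimeError("ether exhausted: the pool was not large enough")


pool = [Component(i) for i in range(3)]   # fixed set, declared up front
first = instantiate(pool, "node0")
second = instantiate(pool, "node1")
```

The set of components never grows, which is what a notation without dynamic creation requires; the price is that the pool must be chosen large enough in advance.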

The sharing defined in the Interactions section of Figure 2 is given by the 
designer. We introduce in our model an automatic sharing mechanism allowing 
variable sharing inside the scope of a single process. As mentioned in Section 2, 
variables with the same name in the same scope share the same value. Thus, if 
a process contains two units both declaring a variable x their values are shared 
by definition. 



4 Mobility Constructs 

The previous sections illustrated the overall structure of our model, and how it 
differs from Mobile Unity, in terms of both syntactic differences in the way a 
specification is textually laid out and semantic differences related to the units of 
execution, mobility, and definition. Central to our model is the interplay among 
the notions of execution, scoping, containment, and location. Mobility not only 
determines the set of resources that are available at a given location, but also 
allows the dynamic reconfiguration of the code and data associated with a given 
process. In this section we describe in more detail the set of constructs defined 
in our model. In the next section, we will use Mobile UNITY to give formal 
semantics to these constructs. 

In order to keep the presentation grounded in a practical example, we consider 
a mobile code version of the well-known leader election problem for a set of nodes 
networked in a ring configuration. For the sake of simplicity, our solution will 
employ a single token, whose value is updated at each node by comparing it 
with the value of the identifier of the node on which the token is located. The 
algorithm is trivial, because it is guaranteed to find the leader in exactly one 
round. However, the interesting aspect of our solution is not the algorithm but rather
the way the distributed computation is deployed into the network.

We assume that no nodes are initially able to take part in a leader election.
The distributed algorithm is started by injecting into the ring a process that 
contains the necessary knowledge about the distributed computation — a voter. 
This process clones itself repeatedly until the whole ring is populated with voters. 
Interestingly, voters do not contain the logic associated with the token, i.e., they 
do not know how to compare the node’s value with the token’s value — the poll 
strategy. The knowledge about this key aspect of the algorithm is injected into 
the ring in a separate step of the computation in the form of a code unit which is 
placed on an arbitrary node of the ring. Each voter is able to detect the presence 
of the poll code unit on its node and move it into its own scope, thus effectively 
enabling the execution of the unit. The poll code unit has access to a node-level 
data unit that contains the node value. This enables the comparison needed to 
vote. Again, a self-replicating scheme is employed, where each voter passes on
a copy of the unit to the next node in the ring. This structure of the system,
where the poll strategy is kept separate and is loaded dynamically into the voter,
enables the dynamic reconfiguration of the ring. This happens when a new code
unit that contains a different poll strategy is injected in the ring. Again, voters 
detect its presence on their sites and replace the old strategy with the new one. 
Finally, when the token is injected into the ring, the actual leader election starts. 
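
The three deployment steps can be mimicked by a short sequential simulation. It is a sketch under simplifying assumptions: nodes are numbered 0 to N-1, the ring is traversed in index order, the function names deploy and elect are invented, and the initial token value is supplied by the caller because the text does not fix it. The poll strategy is a parameter so that replacing it corresponds to injecting a new code unit.

```python
def deploy(n):
    """Nodes reached by a component injected at node 0 that copies itself to
    the next node unless that node is node 0 again."""
    reached = [0]
    while (reached[-1] + 1) % n != 0:
        reached.append(reached[-1] + 1)
    return reached


def elect(node_values, token, poll=min):
    """Run one round of the single-token election and return the token."""
    n = len(node_values)
    voters = deploy(n)          # step 1: voters populate the ring
    strategy_at = deploy(n)     # step 2: the poll code unit follows
    assert voters == strategy_at == list(range(n))
    for node in voters:         # step 3: the token visits each node once
        token = poll(node_values[node], token)
    return token


leader = elect([4, 2, 7, 5], float("inf"))                  # smallest value wins
other = elect([4, 2, 7, 5], float("-inf"), poll=max)        # replaced strategy
```

Swapping min for max changes the outcome without touching the voters, which is the kind of reconfiguration the separate poll code unit is meant to enable.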

Our example, despite its simplicity, highlights many of the leitmotifs of mo- 
bile code: simultaneous migration of the code and state associated with a unit 
of execution, dynamic linking (and upgrade) of code, and location-dependent 
resource sharing. For instance, our solution can be easily adapted to an active 
network scenario where a new service (in our case the ability to perform leader 
election) is deployed in the network, and some of its constituents (in our case 
the poll strategy) are dynamically upgraded over time. 

A formal specification of our leader election algorithm is shown in Figure 4, 
while Figure 5 shows its graphical representation. The specification uses the 
fine-grained mobile code constructs of our model. The upper part of the
specification contains three program definitions. NodeDefinition specifies a single data
unit x associated with a node. Note how the type declaration for this integer
variable is preceded by the keyword var, which characterizes the variable as a
data unit. Similarly, TokenDefinition specifies a data unit associated with the
variable token. The values of these two variables are accessed (through shar- 
ing) by code units specified by the program PollActions. The latter contains a 
single statement poll, which describes the polling strategy. As discussed in the 
next section, the formal semantics of the model prescribes the execution of this 
statement to be prevented when the corresponding code unit is not within the 
scope of any process. Thus, the comparison in poll is performed only when the 
corresponding code unit is co-located in a voter process that also contains the 
data unit corresponding to token. In this case, the binding rules of the model, 
expressed using the transient variable sharing abstraction provided by Mobile 
Unity, effectively force the same value in both token variables, hence enabling 
the comparison specified by poll. Simultaneously, an additional auxiliary boolean
variable voted is set to signal to the enclosing voter, again by means of sharing
of the variable voted, that the token needs to be passed along the ring.

    System LeaderElection
      Program NodeDefinition
        declare
          x : var integer
      end
      Program TokenDefinition
        declare
          token : var integer
      end
      Program PollActions
        declare
          token : integer [] x : integer [] voted : boolean
        assign
          poll : token, voted := min(x, token), true
      end
      Program VoterActions
        declare
          voted : var boolean [] startup : var boolean [] token : integer [] x : integer [] k : integer
        initially
          voted = false [] startup = true
        assign
          startVoter : put(voter, thisNode, next(thisNode)) if next(thisNode) ≠ node(0)
                     ∥ reference(x, thisNode) ∥ startup := false  if startup
        [] linkCode : move(poll, thisNode, here)
                     ∥ put(poll, thisNode, next(thisNode)) if next(thisNode) ≠ node(0)
                     ∥ destroy(poll, here)  if exists(poll, thisNode)
        [] passToken : move(token, thisNode, here) if exists(token, thisNode)
                     ∥ move(token, here, next(thisNode))
                     ∥ voted := false  if voted ∧ exists(token, here)
      end
      Components
        ⟨[] i : 0 ≤ i < N :: newData(NodeDefinition, x, node(i), i)⟩
        [] newData(TokenDefinition, token, node(0), […])
        [] newCode(PollActions, poll, node(0))
        [] newProcess(VoterActions, voter, node(0), active)
    end

    Auxiliary definitions:  here ≡ λ
                            thisNode ≡ head(λ)
                            next(n) ≡ the node following n in the ring

Fig. 4. Specifying leader election in Mobile Unity extended with fine-grained
mobile code constructs. The Interactions section is assumed to embody the
semantics of the refined model (see Section 5).

Voters are specified by the program VoterActions, which declares the variables
mentioned so far and an additional boolean startup that is used to determine 
whether it is necessary to perform some initialization tasks, i.e., cloning the 
voter itself on the next node to perform the initial deployment of processes in 
the ring, and acquiring a reference to the node’s value. These tasks are performed 
simultaneously by the statement startVoter, which also resets startup to prevent 
the creation of multiple clones of the voter. In startVoter, cloning is performed 
by the put operation. It executes only if the voter that is invoking the operation 
does not immediately precede in the ring node(O) where the whole computation 
started. This guarantees that each node hosts a single voter. The statement
uses some of the auxiliary definitions shown at the bottom of Figure 4. In
particular, here and thisNode are just renamings of the location variable λ in
the voter and of the head function that operates on it, respectively. They serve
the sole purpose of improving readability. While the location of a process is
always set to the name of a site (as processes reside directly on the site), unit
location can refer to sites or to processes. In the latter case, the location is
defined as the concatenation of the name of the site the unit resides on and
of the name of the process that holds it. This is useful in invoking the put
operation whose most general form is put(name, prog, id, location_dest) where
the first three parameters are the three indices of the component to be copied
and location_dest is a location that represents the destination of the copy. Another
form, put(name, location_cur, location_dest), is also provided. It is actually used
in the example to “query” the scope defined by location_cur for the second and
third indices of the component given the name (i.e., first index).

Fig. 5. Leader election with mobile code.

As will become clearer in the next section, copying takes place behind the 
scenes by picking a fresh component from the ether and setting its location to the 
one passed as a parameter. Like most of the operations provided in our model, 
the put operation is defined on components, i.e., both on processes and units.
In the case of processes, the copying is performed recursively on the process and
on all its constituent units. In the case of put, the bindings that a process may
have established are not preserved as a consequence of this copy operation, i.e.,
all the variables are restored to their initial values. This represents a “weak” form
of copying. Our model also provides a stronger notion with the clone operation,
which preserves all the bindings owned by the process.
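
The difference between the weak copy (put) and the strong copy (clone) can be illustrated with a minimal Python sketch. This is not the Mobile UNITY formalization: the `Component` class, the `/`-separated location strings, and the helper `_copy` are assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Component:
    """Illustrative stand-in for a process or unit (hypothetical model)."""
    name: str
    location: str
    initial: dict                       # values given by the program definition
    bindings: Optional[dict] = None     # current values; defaults to the initial ones
    units: list = field(default_factory=list)  # constituent units of a process

    def __post_init__(self):
        if self.bindings is None:
            self.bindings = dict(self.initial)


def _copy(c, dest, keep_bindings):
    # Constituent units are copied recursively and located inside the new copy.
    values = c.bindings if keep_bindings else c.initial
    inner = f"{dest}/{c.name}"
    return Component(c.name, dest, dict(c.initial), dict(values),
                     [_copy(u, inner, keep_bindings) for u in c.units])


def put(c, dest):
    """Weak copy: every variable is restored to its initial value."""
    return _copy(c, dest, keep_bindings=False)


def clone(c, dest):
    """Strong copy: the current bindings are preserved."""
    return _copy(c, dest, keep_bindings=True)
```

In this sketch the original component is left untouched by both operations; only the freshly created copy is placed at the destination.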
The statement startVoter also establishes a reference to the variable x, whose
value is contained in a data unit instantiated on each site. To understand
this latter aspect in more detail, let us take a brief detour and jump temporarily
to the Components section, to look at the initial configuration of the system.
The first statement uses the operation newData to create a data unit named
x using the definition provided in the program NodeDefinition, assigns to it the
value i, and places it on the node. Since the statement is quantified over the
number N of nodes in the system, each node hosts an instance of the data unit
as a result of the operation.

Similarly, the other three statements in the Components section create on 
the first node the data unit for the token, the code unit for the poll strategy, 
and the voter process, respectively. Given the nature of our model, which enables 
movement to the level of a single Mobile Unity variable or statement, it is inter- 
esting to note how VoterActions actually represents the unit of definition for a 
number of units, namely, the data units corresponding to voted and startup, and
the code units corresponding to startVoter, linkCode, and passToken. In princi- 
ple, each of these could be moved or copied independently. Since this is not the 
case in this example, they have been grouped together under VoterActions. This 
simplifies the text of the specification by minimizing the number of Program 
declarations, and also enables the creation of a single process that automatically 
contains instances for all the aforementioned units by using newProcess. Finally,
note how the value of a process is its activation status, i.e., either ACTIVE
or INACTIVE.

Now, let us return to the reference operation in startVoter. Thanks to the 
binding rules, this operation establishes a transient sharing between the variable 
in the data unit x defined in NodeDefinition and the declaration in the voter.
Similarly to what was described for put, only the name of the data unit x is
specified, while its identifier is determined by implicitly querying the node. The
model also provides the inverse operation, unreference.

The statement linkCode takes care of replicating the poll strategy and, possi- 
bly, of substituting the new poll code for the old one. It executes only when the 
exists function in the guard evaluates to true. The function exists, formally in- 
troduced in the next section, effectively models the aforementioned query mech- 
anism, and enables linkCode to execute only when a code unit with name poll 
is found on the node. If the unit is found, the move operation brings it within 
the process, thus enabling its execution. Simultaneously, a copy of the unit is 
sent to the next node in the ring via a put, provided that the next node is not 
node(0). At the same time, if a pre-existing poll unit is found in the process, the
destroy operation removes it from the system. 

Finally, passToken handles the movement of the token. Again, the query 
mechanism is used to get implicitly the identifier of any token data unit present 
on the node and move it within the process to establish the proper bindings. 
After the poll is performed, i.e., voted is set to true, the token is moved from the 
scope of the voter to the next node in the ring. 
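$\mathrm{find}(u, l) \;\triangleq\; \langle \min\, i, j : u_{i,j}.\lambda = l :: (u, i, j) \rangle$

$\mathrm{find}(u, i, l) \;\triangleq\; \langle \min\, j : u_{i,j}.\lambda = l :: (u, i, j) \rangle$

$\mathrm{exists}(u, l) \;\triangleq\; \langle \exists\, i, j :: u_{i,j}.\lambda = l \rangle$

$\mathrm{exists}(u, i, l) \;\triangleq\; \langle \exists\, j :: u_{i,j}.\lambda = l \rangle$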

Fig. 6. Specification of the functions find and exists. 



5 Formal Semantics 

Our general strategy is to reduce the new model for code mobility to a specialization
of the standard Mobile UNITY notation and proof logic. The first step,
explained in the previous sections, shows how we reinterpret a notation which
looks very close, if not identical, to that of Mobile UNITY by simply treating
each variable declaration and statement as a separate, independent program.
Multiple instantiations of each such fine-grained program, called a unit, are defined
in the Components section. Once this transformation from a concrete to
an abstract syntax is completed, the parts of the model still missing are the mechanics
of data sharing within the confines of each process, the control over the
scheduling of statements for execution, and the definition of the various mobility
constructs. Our strategy is to capture all these semantic elements as statements
present in the Interactions section of the Mobile UNITY system and to disallow
the designer from adding anything else to the Interactions section. The result
is a specialization of Mobile UNITY to the problem of fine-grained mobility. The
fact that the entire semantic specification can be reduced to a small set of coordination
statements attests to the flexibility of Mobile UNITY. In the remainder
of the section we consider in turn the topics of scoping, statement scheduling,
and mobility constructs. From now on we use the compact notation $c_{i,j}$ to mean
the instance j of the component named c extracted from program
i. Throughout this section we also assume that:

- Each component (i.e., data unit, code unit, or process) $c_{i,j}$ is characterized
by its location ($c_{i,j}.\lambda$), a request field ($c_{i,j}.\rho$) designed to hold mobility
commands the system is expected to execute on its behalf, and its type
($c_{i,j}.\tau \in \{\text{DATAUNIT}, \text{CODEUNIT}, \text{PROCESS}\}$).

- Each process $q_{i,j}$ is also characterized by an implicitly specified set of contained
units (those located within the process), a set of referenced units
($q_{i,j}.\gamma$), and its activation status
($q_{i,j}.\omega \in \{\text{ACTIVE}, \text{INACTIVE}, \text{TERMINATED}\}$).

The designer does not need to refer to any of these attributes even though they 
are essential to the formal semantic definition. 

When writing code, the designer will typically refer to a component’s name
(e.g., c) rather than its fully qualified name (e.g., $c_{i,j}$) consisting of the three
indices (i.e., c, i, j) defining the component name, program, and index, respectively.
Given the name, the other identifiers can be extracted easily by employing
the functions find and exists defined in Figure 6.
The find function finds an instance of the component named u on the location
l. The name of the program the unit is derived from (i.e., i) can be added as a
parameter in order to constrain the search only to units derived from a particular 
program definition; the same is true for the function exists. Processes, like other 
units, also have three indices: the first index is the name of the process, the 
second is the name of the program the units in the process are derived from 
(e.g., the process voter created with newProcess in the Components section 
of Figure 4), and the third is the instance discriminator. 
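
As a rough illustration of how find and exists behave, the following Python sketch models the system as a dictionary from (name, program, instance) triples to locations. The dictionary representation and the optional `program` argument are assumptions made for this example; selecting the smallest matching triple mirrors the min quantification of Figure 6.

```python
def find(components, name, loc, program=None):
    """Return the smallest (name, program, instance) triple located at `loc`.

    `components` maps identity triples to locations (illustrative model).
    If `program` is given, only units derived from that program are considered.
    Returns None when no unit matches.
    """
    matches = [(n, p, j) for (n, p, j), where in components.items()
               if n == name and where == loc and (program is None or p == program)]
    return min(matches) if matches else None


def exists(components, name, loc, program=None):
    """True when at least one matching unit is present at `loc`."""
    return find(components, name, loc, program) is not None
```

A guard such as the one of linkCode would then correspond to a call like `exists(components, "poll", node)`, with the identifier obtained from `find` when the guard holds.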

Scoping Rules. Since a code unit can only access its own variables, the mecha- 
nism by which we establish scoping and access rules is that of forcing variables 
with the same name and present in the same scope (i.e., contained in the same 
process) to be shared. This can be readily captured by employing one of the 
high level constructs of Mobile UNITY, transient variable sharing across programs
($A.a \approx B.b$ when $p$). The predicate p controlling the sharing simply
needs to capture the scoping rules. Figure 7 shows how these rules can be stated 
as two Mobile UNITY coordination statements. Statement 1 handles sharing be- 
tween a variable in a data unit and a variable in a code unit, while statement 2 
defines the sharing between two variables in data units. 

Statement 1 states that variables* $u_{i,h}.x$ and $w_{j,k}.x$ share the same value
when $u_{i,h}$ is a data unit and $w_{j,k}$ is a code unit, and the two units are within the
same process, or either the data unit or the code unit is referenced by the process
owning the other unit and the two units are on the same site. The engage value
is the value of the variable in the data unit. The two disengage values are
the actual value shared for the data unit variable, and the undefined value for
the code unit variable, respectively; variables in code units are not supposed to
carry a value unless they are sharing it with a data unit. The function sharing
tells if two units have a common “parent” (a parent can be the process within
which they are located or the one which references them), i.e., the units are
in the same scope. In turn, sharing uses the functions childOf($v_{j,k}$, $u_{i,h}$), which
determines whether $v_{j,k}$ is a child of $u_{i,h}$ (i.e., $v_{j,k}$ is a unit contained in $u_{i,h}$), and
referencedBy($v_{j,k}$, $u_{i,h}$), which determines whether $v_{j,k}$ is referenced by $u_{i,h}$.
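
The scoping condition described for statement 1 can be approximated by the following Python sketch. It is one reading of the prose above, not the formal definition: locations are modelled as tuples (a site name followed by the identity of the enclosing process, if any), and the names `loc`, `refs`, and `in_same_scope` are hypothetical.

```python
def head(location):
    """The site component of a location tuple."""
    return location[0]


def child_of(loc, v, u):
    """v is contained in u: v's location is u's location extended with u's identity."""
    return loc[v] == loc[u] + (u,)


def referenced_by(refs, v, u):
    """v belongs to the set of units referenced by process u."""
    return v in refs.get(u, set())


def sharing(loc, refs, processes, u, w):
    """Some process contains one of the two units and references the other."""
    return any((child_of(loc, w, p) and referenced_by(refs, u, p)) or
               (child_of(loc, u, p) and referenced_by(refs, w, p))
               for p in processes)


def in_same_scope(loc, refs, processes, u, w):
    """Same process, or linked through a reference while on the same site."""
    same_process = loc[u] == loc[w] and loc[u] != (head(loc[u]),)
    return same_process or (sharing(loc, refs, processes, u, w)
                            and head(loc[u]) == head(loc[w]))
```

Under this reading, a referenced data unit stops being in scope as soon as it ends up on a different site from the code unit, which is what makes the sharing transient.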

Statement 2 defines sharing between variables in two data units. The variables 
must have the same name in the same scope. Sharing takes place under the same 
conditions as statement 1, except that both variables are in data units. The
engage clause forces the two variables to share the maximum value. Different
policies can implement different semantics for the reconciliation of values. As no
disengage is specified, the variables retain the values they had before the when
condition became false. The update of all shared variables must happen in the 
same atomic step as the assignment to any of them. However, sharing is specified 
separately from the (possibly many) assignments that may change the value of a 

* The formulae in Figure 7 and following assume that variable sharing is well-defined,
i.e., it takes place only among variables which actually appear in the specification of
a unit according to the program definition. Also, distinguished variables like $\lambda$ and
$\tau$ are never shared. The formal definition of these conditions is omitted for the sake
of brevity.
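(1) $u_{i,h}.x \approx w_{j,k}.x$
    when $u_{i,h}.\tau = \text{DATAUNIT} \land w_{j,k}.\tau = \text{CODEUNIT} \land (u_{i,h}.\lambda = w_{j,k}.\lambda \neq \mathrm{head}(w_{j,k}.\lambda) \lor (\mathrm{sharing}(u_{i,h}, w_{j,k}) \land \mathrm{head}(u_{i,h}.\lambda) = \mathrm{head}(w_{j,k}.\lambda)))$
    engage $u_{i,h}.x$
    disengage $u_{i,h}.x \parallel \bot$

(2) $u_{i,h}.x \approx w_{j,k}.x$
    when $u_{i,h}.\tau = w_{j,k}.\tau = \text{DATAUNIT} \land (u_{i,h}.\lambda = w_{j,k}.\lambda \neq \mathrm{head}(w_{j,k}.\lambda) \lor (\mathrm{sharing}(u_{i,h}, w_{j,k}) \land \mathrm{head}(u_{i,h}.\lambda) = \mathrm{head}(w_{j,k}.\lambda)))$
    engage $\max(u_{i,h}.x, w_{j,k}.x)$

(3) inhibit $u_{i,h}.s$
    when $u_{i,h}.\tau = \text{CODEUNIT} \land \langle \exists\, p, m, n : p_{m,n}.\tau = \text{PROCESS} \land (\mathrm{childOf}(u_{i,h}, p_{m,n}) \lor \mathrm{referencedBy}(u_{i,h}, p_{m,n})) :: p_{m,n}.\omega = \text{ACTIVE} \rangle$

Auxiliary definitions:

$\mathrm{sharing}(u_{i,h}, w_{j,k}) = \langle \exists\, p, m, n :: (\mathrm{childOf}(w_{j,k}, p_{m,n}) \land \mathrm{referencedBy}(u_{i,h}, p_{m,n})) \lor (\mathrm{childOf}(u_{i,h}, p_{m,n}) \land \mathrm{referencedBy}(w_{j,k}, p_{m,n})) \rangle$

$\mathrm{childOf}(v_{j,k}, u_{i,h}) = \text{true}$ if $v_{j,k}.\lambda = u_{i,h}.\lambda \circ (u, i, h)$, false otherwise

$\mathrm{referencedBy}(v_{j,k}, u_{i,h}) = \text{true}$ if $(v, j, k) \in u_{i,h}.\gamma$, false otherwise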



Fig. 7. Establishing bindings among units using transient variable sharing and 
statement inhibition. 



variable. To accomplish this, Mobile UNITY has a two-phased operational model
where the first phase involves an ordinary assignment statement execution and
the second is responsible for propagating changes to shared variables. We call the
statements that execute in the second phase reactive statements. Logically, the
set of reactive statements is executed to fixed point right after each non-reactive
statement, and one reactive statement may trigger the execution of other reactive
statements. Transient sharing is ultimately defined using reactive statements [6],
but this is outside the scope of this paper.
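
The two-phase model can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the state is a plain dictionary, each reactive statement is a function that returns whether it changed the state, and `propagate` copies values in one direction only, whereas transient sharing in Mobile UNITY is symmetric. The `max_rounds` bound is a safeguard added for the example.

```python
def run_to_fixed_point(state, reactions, max_rounds=1000):
    """Apply the reactive statements repeatedly until none changes the state."""
    for _ in range(max_rounds):
        changed = False
        for react in reactions:
            changed = react(state) or changed
        if not changed:
            return state
    raise RuntimeError("reactive statements did not reach a fixed point")


def propagate(src, dst):
    """Build a reactive statement that copies `src` to `dst` when they differ."""
    def react(state):
        if state[dst] != state[src]:
            state[dst] = state[src]
            return True
        return False
    return react


def assign(state, var, value, reactions):
    """Phase 1: ordinary assignment. Phase 2: reactions run to fixed point."""
    state[var] = value
    return run_to_fixed_point(state, reactions)
```

With reactions `[propagate("b", "c"), propagate("a", "b")]`, an assignment to `a` first updates `b`, and that update in turn triggers the propagation to `c` in the next round, illustrating how one reactive statement can enable another.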



Statement Scheduling. In Mobile UNITY, each statement is assumed to be 
executed infinitely often in an infinite execution, i.e., weakly fair selection of 
statements is the basis for the scheduling process. The coordination constructs 
of Mobile Unity include a construct for guard strengthening called inhibit. 
In inhibit s when p, for instance, the statement s continues to be selected as 
before, but its effect is that of a skip whenever the condition p is not met. We take 
advantage of this construct in statement 3 of Figure 7 to inhibit statements not 
in the scope of an active process, and statements that have unbound variables. 
A variable appearing in a statement is always unbound if it is not shared with a
variable present in a data unit. 

Mobility Constructs. The designer views the move construct as a mechanism
by which a component at one location is relocated to another. The new location
may be a known site or a known process. This form of the move construct:

move(compName, currentLocation, newLocation)

is actually a special instance of the more general form in which the identity of
the unit is already known. One can simply determine the identity by employing
the function find as in*

move(find(compName, currentLocation), newLocation).

If multiple instances of the same unit exist, one is selected**. In order to explore
the manner in which we assigned semantics to the mobility constructs associated
with our model, we will focus our presentation on the general form of the construct.
Moreover, we will assume that the unit in question is a process named q
with identifier (i, j) destined for location l:

move(q, i, j, l).

Our general strategy is to treat the operation as a macro reducible to a simple
local assignment statement to the distinguished variable $\rho$ (see Figure 8):

$\rho := (\text{REQ}, \text{MOVE}, q, i, j, (l))$

where the first two fields of the record stored in $\rho$ indicate the propagation status
(i.e., an initial request) and the nature of the request (i.e., a move).

We delegate the actual execution of the operation to a series of coordination 
statements built into the Interactions section. The coordination statements 
propagate the request to the contained units and ultimately carry out the mi- 
gration of the individual components to the new location. All these actions are 
executed atomically because they are encoded as reactive statements that exe- 
cute to fixed point before the system is allowed to take any other action. The 
first thing that happens is to have the request transferred in the form of a command
to the process q. The result is that $q_{i,j}.\rho$ is assigned the request with a
propagation status of EXEC:

$q_{i,j}.\rho := (\text{EXEC}, \text{MOVE}, (l))$

while the attribute $\rho$ of the unit issuing the request is cleared. Of course, in
general it might be the case that a unit requests its own movement and one 
needs to distinguish between the two cases as made evident in Figure 9. 

If, for the sake of simplicity, we assume that the only units contained by q
are $d_{m,h}$ and $s_{k,n}$, the next reaction being triggered leads to having the process
ready to start the move, a fact indicated by dropping the propagation status

$q_{i,j}.\rho := (\text{MOVE}, (l))$

* Throughout, we assume that move((q, i, j), l) is unambiguously reducible to
move(q, i, j, l).

** We chose to pick the instance with minimum index.
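$\mathrm{move}(u, i, j, l) \;\triangleq\; \rho := (\text{REQ}, \text{MOVE}, u, i, j, (l))$

$\mathrm{put}(u, i, j, k, l) \;\triangleq\; \rho := (\text{REQ}, \text{PUT}, u, i, j, (\mathrm{getid}(u, i), l)) \parallel k := \mathrm{getid}(u, i)$

$\mathrm{clone}(u, i, j, k, l) \;\triangleq\; \rho := (\text{REQ}, \text{CLONE}, u, i, j, (\mathrm{getid}(u, i), l)) \parallel k := \mathrm{getid}(u, i)$

$\mathrm{destroy}(u, i, j) \;\triangleq\; \rho := (\text{REQ}, \text{DESTROY}, u, i, j, ())$

$\mathrm{activate}(u, i, j) \;\triangleq\; \rho := (\text{REQ}, \text{ACTIVATE}, u, i, j, ())$

$\mathrm{deactivate}(u, i, j) \;\triangleq\; \rho := (\text{REQ}, \text{DEACTIVATE}, u, i, j, ())$

$\mathrm{terminate}(u, i, j) \;\triangleq\; \rho := (\text{REQ}, \text{TERMINATE}, u, i, j, ())$

$\mathrm{new}(u, k, l) \;\triangleq\; \rho := (\text{REQ}, \text{NEW}, \mathrm{getid}(u), (l)) \parallel k := \mathrm{getid}(u)$

$\mathrm{reference}(u, i, j, v, k, h) \;\triangleq\; \rho := (\text{REQ}, \text{REFERENCE}, u, i, j, (v, k, h))$

$\mathrm{unreference}(u, i, j, v, k, h) \;\triangleq\; \rho := (\text{REQ}, \text{UNREFERENCE}, u, i, j, (v, k, h))$

Auxiliary definitions:

$\mathrm{getid}(name) \;\triangleq\; \mathrm{find}(name, \epsilon)$

$\mathrm{getid}(name, i) \;\triangleq\; \mathrm{find}(name, i, \epsilon)$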



Fig. 8. Mapping mobility constructs to Mobile Unity statements. 



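(4) $w_{j,k}.\rho := \bot$ if $w_{j,k} \neq u_{i,h}$ $\parallel$ $u_{i,h}.\rho := (\text{EXEC}, command, u, i, h, args)$
    reacts-to $w_{j,k}.\rho = (\text{REQ}, command, u, i, h, args)$

(5) $u_{i,h}.\rho := (command, args) \parallel \langle \parallel v, n, m : \mathrm{childOf}(v_{n,m}, u_{i,h}) \land \mathrm{toPropagate}(command) :: v_{n,m}.\rho := (\text{EXEC}, command, v, n, m, \mathcal{F}(command, u, i, h, args)) \rangle$
    reacts-to $u_{i,h}.\rho = (\text{EXEC}, command, u, i, h, args)$

Return values for $\mathcal{F}$:

$\mathcal{F}(\text{MOVE}, u, i, h, (l)) = (l \circ (u, i, h))$
$\mathcal{F}(\text{PUT}, u, i, h, ((u, j, k), l)) = (l \circ (u, j, k))$
$\mathcal{F}(\text{CLONE}, u, i, h, ((u, j, k), l)) = (l \circ (u, j, k))$
$\mathcal{F}(\text{DESTROY}, u, i, h, ()) = ()$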



Fig. 9. Modeling the actions of the run-time support. 



while simultaneously propagating the command to the contained units (see Figure 9), e.g.,

$d_{m,h}.\rho := (\text{EXEC}, \text{MOVE}, d, m, h, (l \circ (q, i, j)))$

$s_{k,n}.\rho := (\text{EXEC}, \text{MOVE}, s, k, n, (l \circ (q, i, j)))$

Figure 9 defines the function $\mathcal{F}$ that computes, in a command-specific manner,
the arguments needed by the contained units. In this case, the location
where they need to move is the relocated process. Since further propagation is
no longer possible, the commands drop the propagation status in the next step

$d_{m,h}.\rho := (\text{MOVE}, d, m, h, (l \circ (q, i, j)))$

$s_{k,n}.\rho := (\text{MOVE}, s, k, n, (l \circ (q, i, j)))$

The last step is the change in location of each of the units (Figure 10). Given
the semantics of Mobile UNITY, this may happen in any order but the reactive
statements will be executed again and again until fixed point is reached, i.e.,

$q_{i,j}.\lambda = l \land d_{m,h}.\lambda = l \circ (q, i, j) \land s_{k,n}.\lambda = l \circ (q, i, j)$
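(6) $u_{i,h}.\lambda := l$ if $(u_{i,h}.\tau = \text{PROCESS} \Rightarrow l = \mathrm{head}(l)) \land u_{i,h}.\omega \neq \text{TERMINATED} \land u_{i,h}.\lambda \neq \epsilon$ $\parallel$ $u_{i,h}.\rho := \bot$
    reacts-to $u_{i,h}.\rho = (\text{MOVE}, (l))$

(7) $u_{j,k}.\lambda, u_{j,k}.\omega := l, u_{i,h}.\omega$ if $(u_{i,h}.\tau = \text{PROCESS} \Rightarrow l = \mathrm{head}(l)) \land u_{i,h}.\lambda \neq \epsilon$ $\parallel$ $u_{i,h}.\rho := \bot$
    reacts-to $u_{i,h}.\rho = (\text{PUT}, ((u, j, k), l))$

(8) $u_{j,k}.\lambda, u_{j,k}.\omega := l, u_{i,h}.\omega$ if $(u_{i,h}.\tau = \text{PROCESS} \Rightarrow l = \mathrm{head}(l)) \land u_{i,h}.\lambda \neq \epsilon$ $\parallel$ $u_{i,h}.\rho := \bot$ $\parallel$ $\langle \parallel x :: u_{j,k}.x := u_{i,h}.x \rangle$
    reacts-to $u_{i,h}.\rho = (\text{CLONE}, ((u, j, k), l))$

(9) $u_{i,h}.\lambda := \epsilon$ if $u_{i,h}.\lambda \neq \epsilon$ $\parallel$ $u_{i,h}.\rho := \bot$
    reacts-to $u_{i,h}.\rho = (\text{DESTROY}, ())$

(10) $u_{i,h}.\omega := \text{ACTIVE}$ if $u_{i,h}.\omega = \text{INACTIVE} \land u_{i,h}.\tau = \text{PROCESS} \land u_{i,h}.\lambda \neq \epsilon$ $\parallel$ $u_{i,h}.\rho := \bot$
    reacts-to $u_{i,h}.\rho = (\text{ACTIVATE}, ())$

(11) $u_{i,h}.\omega := \text{INACTIVE}$ if $u_{i,h}.\omega = \text{ACTIVE} \land u_{i,h}.\tau = \text{PROCESS} \land u_{i,h}.\lambda \neq \epsilon$ $\parallel$ $u_{i,h}.\rho := \bot$
    reacts-to $u_{i,h}.\rho = (\text{DEACTIVATE}, ())$

(12) $u_{i,h}.\omega := \text{TERMINATED}$ if $u_{i,h}.\omega \neq \text{TERMINATED} \land u_{i,h}.\tau = \text{PROCESS} \land u_{i,h}.\lambda \neq \epsilon$ $\parallel$ $u_{i,h}.\rho := \bot$
    reacts-to $u_{i,h}.\rho = (\text{TERMINATE}, ())$

(13) $u_{i,h}.\lambda := l$ if $(u_{i,h}.\tau = \text{PROCESS} \Rightarrow l = \mathrm{head}(l))$ $\parallel$ $u_{i,h}.\rho := \bot$
    reacts-to $u_{i,h}.\rho = (\text{NEW}, l)$

(14) $u_{i,h}.\gamma := u_{i,h}.\gamma \cup \{(v, j, k)\}$ if $u_{i,h}.\tau = \text{PROCESS} \land v_{j,k}.\tau \neq \text{PROCESS} \land u_{i,h}.\lambda \neq \epsilon \land v_{j,k}.\lambda \neq \epsilon$ $\parallel$ $u_{i,h}.\rho := \bot$
    reacts-to $u_{i,h}.\rho = (\text{REFERENCE}, (v, j, k))$

(15) $u_{i,h}.\gamma := u_{i,h}.\gamma \setminus \{(v, j, k)\}$ $\parallel$ $u_{i,h}.\rho := \bot$
    reacts-to $u_{i,h}.\rho = (\text{UNREFERENCE}, (v, j, k))$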



Fig. 10. Migrating components. 



All other constructs function in a similar manner except that not all the com- 
mands are propagated to the contained units. For instance, terminate affects 
only the status of the process. The function toPropagate used in Figure 9 is de- 
signed to control the propagation process: the propagating constructs are move, 
put, clone, and destroy. The construct getid returns the three-part identity of 
a component located in the ether. A minimal lexicographical value for the triplet 
is selected. The complete list of commands and the corresponding formalization 
appear in Figures 8 and 10. 
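
The propagation of a move through a process and its contained units can be summarized by the following Python sketch. It collapses the request hand-off into a work list of (status, unit, destination) entries and is only an approximation of the reactive statements: the tuple-based locations, the `children` map, and the names `execute_move` and `f_args` are assumptions made for illustration.

```python
PROPAGATING = {"MOVE", "PUT", "CLONE", "DESTROY"}


def to_propagate(command):
    """Commands that are forwarded to the units contained in a process."""
    return command in PROPAGATING


def f_args(unit, dest):
    """Destination handed to a contained unit: the relocated parent."""
    return dest + (unit,)


def execute_move(loc, children, unit, dest):
    """Process pending commands until none is left (the fixed point)."""
    pending = [("EXEC", unit, dest)]
    while pending:
        status, u, where = pending.pop()
        if status == "EXEC":
            # Drop the propagation status and forward the command to the children.
            pending.append(("READY", u, where))
            if to_propagate("MOVE"):
                pending.extend(("EXEC", c, f_args(u, where))
                               for c in children.get(u, ()))
        else:
            # Final step: the unit changes location.
            loc[u] = where
    return loc
```

When the loop terminates, the process is at the requested location and each contained unit is located at that location extended with the identity of the process, matching the fixed point stated above for q, d, and s.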



6 Conclusions 

This paper can be regarded as a follow-up on our earlier work on modeling mobile 
code paradigms using Mobile UNITY [10]. By contrast, the model presented 
in this paper adopts an unusually fine level of granularity by considering the 
mobility of code fragments as small as single variables and statements. Our 
primary goal was to demonstrate the feasibility of specifying and reasoning about 
computations involving fine-grained mobility. Nevertheless, the study has been
instrumental in helping us develop a better understanding of basic mobility 
constructs and composition mechanisms needed to support such a paradigm. 
Composition and scoping emerged as key elements to the construction of complex 
units out of bits and pieces of code. The need for both containment and reference 
mechanisms was not in the least surprising given current experience with object-oriented
programming languages, but it was refreshing to rediscover it coming
from a totally new perspective. The distinction between the units of definition, 
mobility, and execution proved to be very helpful in structuring our thinking 
about the design of highly dynamic systems. The necessity to provide some form 
of name service capability (the find function) appears to align very well with the 
current trend in distributed object processing. The next step is to revisit fine- 
grained mobility from a more pragmatic perspective, one which will encompass 
both the design of a fine-grained mobile code system and its use in distributed 
applications. 

Acknowledgments. This paper is based upon work supported in part by the 
National Science Foundation (NSF) under grant CCR-9624815.
Abstract. This paper presents an architectural style for real-time systems, and
an associated formal architectural description language, called Robots. A basic 
specification in Robots consists of a synchronous control task that is responsible 
for the dynamic reconfiguration of the system controller as a set of 
asynchronous observer and process tasks. The controller architecture evolves 
by hierarchical refinement of observers and processes into lower level control 
tasks each dominating a new set of observers and processes. Robots is given 
operational semantics by statecharts. Also, the architectural style is embedded 
in Robots by semantic rules that allow formal checking of the consistency and 
completeness of architectural specifications. 



1 Introduction 

Software architecture is concerned with the organization of the software as a set of 
interacting components. An architectural style specifies the kinds of components and 
connectors that may be used to compose a system, and defines constraints on the way 
the composition is done [13]. The actual architecture of a software system affects to a 
large extent the capabilities of program analysis, understanding, maintainability, and 
verification with respect to the system requirements. Therefore, it is important to 
develop the software design according to a well-defined architectural style that would 
lead to a rigorous, yet simple and clear structure of the software architecture. The 
effectiveness of architectural styles is considerably increased by their formalization 
by ADLs (Architectural Description Languages), in which case automatic code 
generation, and mechanization of analysis and verification procedures, are possible. 

Real-time systems introduce special architectural requirements. Such systems 
consist of a controller (an embedded computer) aimed at the stabilization of an on- 
going process at a required state dictated (dynamically) by a commanding 
environment. The controller continuously reacts to the environment data and events 
by computed responses that must be accomplished within hard deadlines such that the 
reaction effect will be relevant to the present process state (the real-time requirement). 

Typical real-time systems (e.g., autopilot, traffic control, production line) are 
characterized by complex behaviors which involve concurrent control of many 
interacting sub-processes. In addition, such systems are large and have complex
structural requirements (multi-level, distributed). The corresponding software architecture
must describe the partition of the program into concurrent computation threads, the
dynamic reconfiguration of the set of active threads following mode transitions, and 
the time constrained interaction between the active components at each mode. 
In this paper, we present an architectural style for real-time systems and its
implementation by an ADL, called Robots. A basic specification in Robots consists of 
a synchronous control task that is responsible for the dynamic reconfiguration of the 
system controller as a set of asynchronous observer and process tasks. The controller 
architecture evolves by hierarchical refinement of observers and processes into lower 
level control tasks each dominating a new set of observers and processes. Robots is 
given operational semantics by statecharts. Also, the architectural style is embedded 
in Robots by semantic rules that allow formal checking of the consistency and 
completeness of architectural specifications. Thus, it is possible to perform automatic 
program analysis that can be used for program understanding and verification. 

Though the presentation in this paper is rather informal, we hope it gives a precise
notion of the language and its semantics. Thus, Section 2 provides a description of
Robots through a worked-out example. Section 3 describes, informally, the semantics 
of Robots. Section 4 discusses related work, and Section 5 describes the current 
status, and future work. 

2 The Architectural Style of Robots 

The common approach to real-time design employs tasks as the building blocks of 
real-time programs. Each task forms an independent thread of execution. A program 
is configured as a set of interacting tasks that can be dynamically reconfigured by 
creation and deletion of tasks, thus adjusting the controller operation to the current 
working mode. Inter-task synchronization takes place through events. More 
specifically, a task can alternately signal events while executing, or suspend its
operation until the occurrence of a certain event (or one of a set of events, in the 
general case). 

An architectural style for real-time systems provides means for representing the 
dynamic reconfiguration of the set of active tasks along the system operation, and a 
scheme of interaction between the active tasks at each mode. 

Robots is a description language that supports a specific architectural style of real- 
time systems. This section presents Robots and the associated architectural style 
through a worked out example of an automatic cruise control system. This example 
has been extensively used to demonstrate and compare architectural styles of real- 
time systems [13]. Hence, it can be instructive to evaluate our approach. 

2.1 Automatic Cruise Control System 

The Automatic Cruise Control (ACC) system is intended to maintain the speed of a 
car under the driver's command. The ACC interface with the driver (Fig. 1) consists of a
master switch (on/off), a resume button, the driving control pedals (brakes and gas), 
and a three-state command lever indicating driver requests for maintaining (M), 
decreasing (D), or increasing (I) the speed of the car. The ACC interaction with the 
car motion system takes place through a discrete line indicating whether the engine is 
running, speed and RPM meters, a throttle, and a gear integrator.

The ACC is enabled only when the engine is running, in which case it takes over the 
speed control whenever the master switch is turned on. The control is released upon 
turning the master switch off. While active, the ACC maintains the speed recorded 
Fig. 1. Interface chart of the ACC system



upon entering this state by coupled control loops over the throttle and the gear*. The
driver may require decrease or increase of the desired speed by moving the command 
lever to the corresponding position. The control operation is immediately suspended if 
any of the driving pedals are pressed. The driver may return control to the ACC by 
pressing the resume button (in which case the ACC continues to maintain the last 
speed command). 

2.2 Basic Architectural Structure 

The basic architectural structure of Robots is a robot; a system comprises a set of
hierarchically related robots. For example, Fig. 2 illustrates the software architecture
corresponding to the top-level robot of the ACC. In general, a robot consists of three 
types of tasks (indicated by different shapes in the figure) that interact according to a 
definite scheme. Specifically, we employ the following task-types. 

• A single control task whose role is to maintain the operational mode of the 
controller (the task ACC in the present example). It operates by responding to 
mode transition events with proper reconfigurations (creation and deletion) of the 
set of active tasks. 

• Observers are tasks whose role is to identify and report to the control task the 
occurrences of events in the robot environment (both the commanding 
environment and the process under control). For example, the task Engine is an 
observer intended to identify the process events engine-on and engine-off
(indicating that the engine starts and stops running, respectively). On the other hand, the
task Switch is an observer that identifies the commanding events start-acc and
stop-acc (indicating that the master switch is turned on and off, respectively).

• Processes are tasks that perform actual control operations that regulate the 
behavior of the controlled process. In our example, the process CruiseControl is 
responsible for maintaining the actual automatic cruise control process. 



* The gear control requirement is not part of the original problem. It has been added in order to 
demonstrate composition of synchronizing processes. 



Fig. 2. The top-level software architecture of the ACC system 

The idea underlying a robot construction is that at every state of the system 
operation the control task is responsible for maintaining active the processes that 
perform the operation required at the present mode, and the observers that identify the 
events that might cause a transition to a following operational mode. For instance, the 
task Engine must be active as long as the ACC is operating. On the other hand, the 
task Switch must be kept active only while the engine is running, and the task 
CruiseControl is active only when the switch is on while the engine is running. 

Thus, in general, a robot comprises a single control task that manipulates the 
creation and deletion of a number of processes and observers which in turn signal the 
control task with the occurrences of events that may cause further reconfiguration of 
the set of active tasks. 



2.3 Robots: The Architectural Description Language 

A robot specification in Robots comprises textual and visual descriptions to represent 
a real-time subsystem in the architectural style described above. For instance, the 
following robot represents the top-level specification of the ACC. 



robot ACC 
control: 



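[Statechart ACC, with task-sticker Engine on the outermost frame. States: Disabled
(initial) and Enabled; transitions engine-on (Disabled to Enabled) and engine-off
(Enabled to Disabled). Enabled carries the task-sticker Switch and contains the
substates Inactive (initial) and Active; transitions start-acc (Inactive to Active)
and stop-acc (Active to Inactive). Active carries the task-sticker CruiseControl.]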



where 

observer Engine signals: { engine-on, engine-off }. 
observer Switch signals: { start-acc, stop-acc }. 
process CruiseControl. 

The first line identifies the robot by a unique name (which is also used as the 
identifier of the control task that follows). The body of the robot consists of the 
specification of the robot’s (unique) control task followed by a “where” section that 
specifies the interface of the servers (the observers and processes) that are 
manipulated by the control task. 

The control task behavior is expressed by a sequential statechart [8] that is 
augmented with task-stickers placed on the frames of the states. A task-sticker bears a 
task name and has a shape that indicates its type: rounded-corner rectangles denote 
observers, and normal rectangles denote processes. Operationally, the intended 
meaning of task-stickers that are attached to a state is that the corresponding tasks are 
created upon each entry to the state and deleted upon each exit from it. 

In our example, the top-level behavior, given by the ACC statechart, consists of two 
states: Disabled (initial) and Enabled. The transitions between these states are caused, 
respectively, by the events engine-on and engine-off that are identified by the observer 
task Engine, which is always active (indicated by the task-sticker placed on the 
outermost state frame). The state Enabled consists of two substates: Inactive (initial) and 
Active. The transition from Inactive to Active is due to the event start-acc (identified 
by the observer task Switch, which becomes active upon entry to the state 
Enabled). In the other direction, the event stop-acc causes the transition from 
Active to Inactive. Within the state Active, the process CruiseControl, which 
maintains the actual cruise control operation, is active. 
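
The operational reading of task-stickers described above can be illustrated with a small Python sketch. It is not part of Robots; the dictionary layout and the helper names (enclosing, active_tasks, step, run) are assumptions made for illustration. The state names, events, and tasks are those of the ACC example.

```python
# A minimal sketch of the ACC control task: states carry task-stickers, and
# a task is active exactly while a state bearing its sticker is occupied.
# The data layout and helper names are illustrative assumptions.

STATES = {
    # state: (enclosing state, task-stickers on its frame)
    "ACC": (None, ["Engine"]),
    "Disabled": ("ACC", []),
    "Enabled": ("ACC", ["Switch"]),
    "Inactive": ("Enabled", []),
    "Active": ("Enabled", ["CruiseControl"]),
}

# (source state, event) -> target basic state. Entering Enabled means
# entering its initial substate Inactive.
TRANSITIONS = {
    ("Disabled", "engine-on"): "Inactive",
    ("Enabled", "engine-off"): "Disabled",
    ("Inactive", "start-acc"): "Active",
    ("Active", "stop-acc"): "Inactive",
}


def enclosing(state):
    """Return the state followed by all states that enclose it."""
    chain = []
    while state is not None:
        chain.append(state)
        state = STATES[state][0]
    return chain


def active_tasks(state):
    """Tasks whose stickers sit on the state or on any enclosing state."""
    return {task for s in enclosing(state) for task in STATES[s][1]}


def step(state, event):
    """Take the transition enabled by the event, if any; otherwise stay."""
    for s in enclosing(state):
        if (s, event) in TRANSITIONS:
            return TRANSITIONS[(s, event)]
    return state


def run(events, state="Disabled"):
    """Replay events and record which tasks are created and deleted."""
    log = []
    for event in events:
        target = step(state, event)
        before, after = active_tasks(state), active_tasks(target)
        log.append((event, target, sorted(after - before), sorted(before - after)))
        state = target
    return log
```

Calling run(["engine-on", "start-acc", "engine-off"]) yields one entry per event, listing the state reached together with the tasks created and deleted by that transition. Engine never appears in these lists because its sticker is on the outermost frame.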

The specification of a server interface starts with the type identifier, observer or 
process, followed by the task name. Observer specifications, like those of Engine and 
Switch, also contain the field “signals” that lists the events that are identified by the 
task. The specification of the process interface, CruiseControl, contains no additional 
information. Note that the specification of a server type is redundant (it is already 
given by the shape of the task-sticker); however, this form was encouraged by Robots’ 
users for the sake of readability. 

The server tasks are classified to be either in the status to-be-refined (as indeed are 
the tasks in the present example), or in the status executable if the specification also 
contains the activation conditions of the task. In a complete system specification, 
every to-be-refined task must have an associated refinement (see next section). 

It is worth mentioning the constraints imposed on the functionality and 
interaction of tasks, as illustrated by the robot ACC. For instance: 

• tasks are created and deleted only by the control task, 

• every event has a definite source, 

• events flow in a fixed direction: from observers to the control task. 

Such constraints simplify to a large extent the validation of the consistency and 
completeness of a design. For instance, given a robot specification, it can be easily 
verified whether or not the observers activated at each level provide for the 
identification of all (and only) the events that might cause state transitions at that 
level. In fact, the satisfaction of such constraints can be checked automatically due 
to the formal semantics of Robots (see Section 3). 
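
As an illustration of the "all (and only)" check, the following Python sketch compares, for each level of the ACC statechart, the events that label transitions with the events that the observers activated at that level can signal. The tables SIGNALS and LEVELS and the function names are assumptions for this sketch, not Robots syntax.

```python
# Events each observer may signal, taken from the "where" section of robot ACC.
SIGNALS = {
    "Engine": {"engine-on", "engine-off"},
    "Switch": {"start-acc", "stop-acc"},
}

# level: (observers activated at that level, events labeling its transitions)
LEVELS = {
    "ACC": (["Engine"], {"engine-on", "engine-off"}),
    "Enabled": (["Switch"], {"start-acc", "stop-acc"}),
}


def check_level(observers, expected, signals=SIGNALS):
    """Return (missing, unexpected) events for one level.

    missing: transition events that no active observer can signal.
    unexpected: signaled events that cause no transition at this level.
    """
    provided = set()
    for observer in observers:
        provided |= signals[observer]
    return expected - provided, provided - expected


def check_all(levels=LEVELS):
    """Return the levels that violate the check, with their discrepancies."""
    report = {}
    for level, (observers, expected) in levels.items():
        missing, unexpected = check_level(observers, expected)
        if missing or unexpected:
            report[level] = (missing, unexpected)
    return report
```

For the ACC example check_all() returns an empty report. Dropping Switch from the level Enabled would report start-acc and stop-acc as missing.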

2.4 Structures of Robots 

A robot describes a single control process at a certain level of abstraction. In this 
section we present two relations between robots, refinement and composition, that 
enable the design of a large system as a hierarchical structure of concurrent robots. 

Refinement. One way to generate a structure of robots is by stepwise refinement of 
their servers into new robots (Fig. 3). Both observers and processes, but not the 
control task, are eligible for refinement, provided that their specification status is to- 
be-refined. The only commitment of a refinement is to respect the interface 
specification of the refined task as it is declared in the host robot (the robot whose 
task is refined). 




Fig. 3. Task refinement 

The interaction between a host and its refinements takes place at two levels. First, the 
host controls the activations of its refinements (actually accomplished by the creation 
and deletion of their control tasks). Then, active observer refinements synchronize the 
host operation by signaling events to its control task (these events are generated by 
the refinements’ control tasks). 

In the robot ACC, all the servers are eligible for refinement since they are all 
specified as to-be-refined. We start with a refinement of the process CruiseControl 
given by the following robot. 

robot CruiseControl 
control: 





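[Statechart CruiseControl: states Operating (initial) and Suspended, with task-stickers Pedals, Resume, and SpeedControl. SpeedControl is active in Operating and Resume in Suspended. Transitions: Operating to Suspended on abort, Suspended to Operating on resume.] 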



where 

observer Pedals signals: { abort }, activation: interrupt#3, deadline: 20ms. 
observer Resume signals: { resume }, activation: periodic#20Hz. 
process SpeedControl. 

The robot CruiseControl employs the observers Pedals and Resume to control the 
activation of the process SpeedControl. The robot starts in state Operating where 
SpeedControl is active. The event abort, signaled by Pedals to indicate that the driver 
has pressed either the gas or the brake pedal, causes a transition to state Suspended, in 
which case the task Resume becomes active. The return to state Operating is caused 
by the event resume that is signaled by the task Resume whenever a press on the 
resume-button has been identified. 

The interface declarations of the observers Pedals and Resume are enriched with 
two fields: “activation”, and “deadline”. The first specifies the activation event of the 
task. There are two types of activation events in this example: 

• periodic (in the task Resume) indicates periodic activation of the task at the rate 
denoted by the associated parameter, 

• interrupt (in the task Pedals) indicates activation by a hardware interrupt 
identified by the associated integer. 

In general, we allow activations by any form of sporadic events (in particular, events 
signaled by terminations of tasks; see below). 

The field “deadline” specifies the maximal allowed duration, measured from the 
occurrence of the activation event until the termination of the execution of the task. In 
the case of an interrupt-driven task, like Pedals, the deadline must be explicitly specified. 
For periodic activation, if the deadline is not specified (as in the case of Resume), 
then the default is the task period. 
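
The deadline rule can be made concrete with a short Python sketch. The function name deadline_ms and its string-based interface are assumptions for illustration. Rejecting every non-periodic activation that lacks a deadline is also an assumption: the text requires an explicit deadline only for interrupt-driven tasks.

```python
def deadline_ms(activation, deadline=None):
    """Return the effective deadline of a task in milliseconds.

    activation: a value such as "periodic#20Hz" or "interrupt#3".
    deadline: an optional value such as "20ms".
    """
    kind, _, parameter = activation.partition("#")
    if deadline is not None:
        if not deadline.endswith("ms"):
            raise ValueError("deadline must be given in ms, e.g. '20ms'")
        return float(deadline[:-2])
    if kind == "periodic":
        if not parameter.endswith("Hz"):
            raise ValueError("periodic rate must be given in Hz, e.g. '20Hz'")
        # Default deadline of a periodic task: its period.
        return 1000.0 / float(parameter[:-2])
    # Assumption: any other activation kind needs an explicit deadline.
    raise ValueError(kind + " activation requires an explicit deadline")
```

With these definitions, Resume (periodic#20Hz, no deadline) gets a deadline of 50 ms, its period, while Pedals (interrupt#3, deadline: 20ms) keeps its declared 20 ms.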

A server specified with an activation field is defined to be executable. In our model 
(see Section 3.2), an executable task is a pure data-processing, terminating program 
(one that contains no synchronization with the environment). A specification in 
Robots provides only the activation conditions of the task. The program is executed 
in response to every occurrence of an activation event, and every termination of the 
program is considered an event that is signaled to the local control task. The event is 
identified either by the value returned by the program computation, or by the default 
value done, which indicates only that the program execution has terminated. In 
normal execution the program is expected to terminate before the specified deadline 
expires. Otherwise, if the deadline is exceeded, the whole execution of the system is 
considered erroneous. Note that the program of an executable task is not 
part of the Robots specification (see Section 2.6). However, the prohibition of 
synchronized interaction with the environment ensures that whenever computations 
are added to the system specification they cannot invalidate the behavioral properties. 

Next we demonstrate a refinement of the task Switch (the refinement of Engine 
behaves similarly to that of Switch, hence its description is omitted). 

robot Switch 

signals: { start-acc, stop-acc }. 
control: 




where: 

observer SwitchOff signals: { sw-off }, activation: periodic#10Hz. 
observer SwitchOn signals: { sw-on }, activation: periodic#10Hz. 

The field “signals”, declared before the control task, specifies the events that are 
signaled to the robot environment (in this case the ACC control task). In general, the 
declaration of a “signals” field with a robot interface extends the consistency 
commitments. Namely, it must be checked that the specified events are actually 
generated by the control task. In our example, the control task Switch generates the 
interface events while transiting between the states Off and On in accordance with the 
corresponding master switch transitions. Note that the interface events are identified 
by different names within the robot (for instance, start-acc is sw-on). This is not 
mandatory; however, we find it advantageous as it indicates different levels of 
abstraction. 

Composition. A composition of robots describes concurrent processes that 
synchronize each other’s executions via events. The composition of robots is a special 
robot that specifies a matching of the events that are signaled in one robot and awaited 
in other robots of the composition. For example, let the following requirements 
define the speed control process in the ACC. 

1. The process controls the gear transitions between three positions: 1, 2, D, 
according to the engine RPM. 

2. The speed control is operated in two modes: normal and special (each 
characterized by a different transfer function). 

3. The normal mode is maintained at position D, and at position 2 provided it has 
been entered from position D, or it is the initial gear position. 

4. The special mode is maintained at position 1, and at position 2 provided it has 
been entered from position 1. 

We first specify the behaviors of the gear and the throttle processes by separate 
robots. The gear control process is specified as follows. 

robot GearCrl 
control: 

where: 

observer RPM signals: { rpm-1, rpm-2, rpm-D }, activation: periodic#10Hz. 
process Set-1 activation: once-at-creation, deadline: 100ms. 
process Set-2 activation: once-at-creation, deadline: 100ms. 
process Set-D activation: once-at-creation, deadline: 100ms. 

The observer RPM signals the gear position corresponding to the current RPM. It 
serves to determine the initial gear position (D, 1, or 2) and the subsequent changes. 
Each of the processes Set-1, Set-2, and Set-D shifts the gear into the 
corresponding position. They are activated just once upon entry to the 
corresponding state. This is a special form of a sporadic event that is specified by the 
value once-at-creation used in the activation field. 

Next, the throttle behavior is specified as follows. 

robot ThrottleCrl 

host-events: { low, drive } 
control: 




where 

process NormalCrl. 
process SpecialCrl activation: periodic#50Hz. 

The interface field “host-events” indicates events that are expected to be signaled by 
any robot that will host ThrottleCrl (thus no observer will be allocated within the robot 
in order to detect them). A definition of this kind allows us to employ robots as library 
components (the events are instantiated by the application that uses the robot). 

Given the specifications of the robots ThrottleCrl and GearCrl, their composition is 
specified by the robot SpeedControl as follows. 

robot SpeedControl 

composition: { ThrottleCrl, GearCrl } 
where: ThrottleCrl.low <- GearCrl.rpm-1, 
       ThrottleCrl.drive <- GearCrl.rpm-D. 

The terms of the form “event1 <- event2” specify that every signal of event2 is also 
interpreted as a signal of event1 (events in a composition are represented in the 
form robot-id.event-id, where robot-id denotes the robot in which the event is 
declared). In this way we synchronize the events of the robots that participate in 
the composition. 

Semantically, a composition of robots is equivalent to a single robot whose control 
task is represented by the AND-composition of the statecharts of the individual 
robots, where every transition labeled by event2 is replaced by a transition labeled by 
event2/event1, for every declaration “event1 <- event2”. 
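
The effect of the composition SpeedControl can be sketched in Python as two state machines stepped together, with the declared bindings turning gear events into throttle events. The transition tables GEAR and THROTTLE are hypothetical reconstructions from requirements 1 to 4 above, since the statecharts of GearCrl and ThrottleCrl are not reproduced in this text. The state names (Init, Pos1, Pos2, PosD, Normal, Special) are likewise assumptions.

```python
# Hypothetical transition tables: (state, event) -> next state.
GEAR = {
    ("Init", "rpm-1"): "Pos1",
    ("Init", "rpm-2"): "Pos2",
    ("Init", "rpm-D"): "PosD",
    ("Pos1", "rpm-2"): "Pos2",
    ("Pos2", "rpm-1"): "Pos1",
    ("Pos2", "rpm-D"): "PosD",
    ("PosD", "rpm-2"): "Pos2",
}
THROTTLE = {
    ("Normal", "low"): "Special",
    ("Special", "drive"): "Normal",
}

# Declarations "event1 <- event2", stored as event2 -> event1:
# ThrottleCrl.low <- GearCrl.rpm-1 and ThrottleCrl.drive <- GearCrl.rpm-D.
BINDINGS = {"rpm-1": "low", "rpm-D": "drive"}


def step(config, gear_event):
    """Advance both components of the AND-composition by one gear event."""
    gear, throttle = config
    gear = GEAR.get((gear, gear_event), gear)
    # A transition labeled event2 behaves as event2/event1: the bound
    # event is signaled to the other robot in the same step.
    host_event = BINDINGS.get(gear_event)
    if host_event is not None:
        throttle = THROTTLE.get((throttle, host_event), throttle)
    return gear, throttle


def run(events, config=("Init", "Normal")):
    """Return the sequence of configurations visited."""
    trace = [config]
    for event in events:
        config = step(config, event)
        trace.append(config)
    return trace
```

Under these assumed tables, position 2 keeps the mode of the position from which it was entered, as requirements 3 and 4 demand: run(["rpm-D", "rpm-2", "rpm-1", "rpm-2"]) passes through ("Pos2", "Normal") and later ("Pos2", "Special"). Neither table mentions the other robot; the coupling lives only in BINDINGS.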

From a software engineering perspective, the notation employed in Robots has the 
advantage of loosening the coupling between the synchronizing processes, since the 
connection is made only at the level where the processes are actually composed (thus, 
each process can be specified independently). In contrast, in Statecharts the signaling 
task must “know” that another task is synchronizing on its events; moreover, it must 
know the identification of the event in the synchronized task. 

2.5 Completing the ACC Specification 

We complete the specification of the ACC with the refinement of the process 
NormalCrl, as follows. 



robot NormalCrl 
control: 

where: 

observer CmdLever signals: { maintain, inc-cmd, dec-cmd }, 
activation: periodic#5Hz. 
process SetSpeedCmd signals: { done }, 
activation: once-at-creation, deadline: 50ms. 
process IncSpeedCmd activation: periodic#25Hz. 
process DecSpeedCmd activation: periodic#25Hz. 
process ThrottleCrlLoop activation: periodic#20Hz. 



This robot starts operating in state Init where the process SetSpeedCmd is activated 
just once (it initializes the speed command to be maintained by the ACC to the current 
car speed). The exit from state Init is caused by the event done that is signaled by 
SetSpeedCmd upon its termination (see Section 2.4). Termination events do not need 
allocated observers in order to be detected (they are handled by the operating system; 
see the subsection on the execution environment in Section 3.2). 

The transition events, identified by CmdLever, denote changes in the position of the 
command lever. The processes IncSpeedCmd and DecSpeedCmd modify the speed 
command in open loop, and ThrottleCrlLoop performs the actual control operation. 
Each is an executable task associated with a corresponding computation 
(ThrottleCrlLoop, for instance, reads the current speed, compares it with the speed 
command, feeds the error into a transfer function, and outputs the computed rate 
command to the throttle). 



Fig. 4 presents the complete tree structure of tasks that implements the ACC system 
(rectangles denote control tasks, circles denote executable servers). Every control task 
in the hierarchy together with its immediate subordinates constitutes a robot (parallel 
horizontal lines denote composition). A Robots specification of a system completely 
defines the computations of the control tasks. The computations of the servers 
consist of an implicit control part that implements the activation conditions, and a 
data-processing part that does not employ any synchronization constructs (see Section 
3.2). 

Fig. 4. The ACC structure of robots 



2.6 The Missing Part: Asynchronous Interaction 

The architectural style of Robots represents the synchronized interaction among the 
components of a software system. Yet, the executable servers of a Robots 
specification may maintain asynchronous interaction through shared memory. 

In general, the asynchronous and synchronous interconnection schemes of a system 
do not follow the same pattern and hierarchical structure. For example, the speed 
command is set by the tasks SetSpeedCmd, IncSpeedCmd, and DecSpeedCmd, and 
consumed by the tasks ThrottleCrlLoop and SpecialCrl (Fig. 5). 




Fig. 5. Asynchronous data flow (following the specification of CruiseControl) 

Hence, a complete specification of the system architecture must also include the 
asynchronous data flow between the executable servers identified along the 
development of the Robots real-time architecture. (The control tasks cannot 
participate in asynchronous data flow since, by definition, they maintain only 
synchronous interaction.) In Robots, we use simple, pure data flow diagrams (like the 
one presented in Fig. 5). Due to space constraints we omit their explicit description 
from this paper. 

Given an asynchronous data flow scheme, it is important to resolve consistency and 
completeness questions regarding the feasibility and correctness of interactions. For 
example: 

• Could it be that IncSpeedCmd or DecSpeedCmd is concurrently active with 
ThrottleCrlLoop or SpecialCrl (in which case mutual exclusion is required)? 

• Is there a situation where SetSpeedCmd and IncSpeedCmd or 
DecSpeedCmd try to set speed-command concurrently? 

• Could it be that SpecialCrl tries to read speed-command before SetSpeedCmd 
has set it? 

• Is there a possibility of deadlock? 

Due to the clear separation of the synchronous and asynchronous specifications, such 
questions can be answered in Robots by mechanized model checking procedures (a 
goal for future development, see Section 5). 

3 Semantics of Robots 

In this section we describe, informally, the semantics of Robots. The semantics 
consists of two parts. First we define semantic rules that express the structure and 
behavior of the architectural style presented in this paper. These rules are checked 
automatically at compilation time. The second part presents the behavioral semantics 
of a specification in Robots by translation into a corresponding statechart. 

3.1 Semantic Rules 

Given a specification in Robots that consists of a finite set of robots, the semantic 
rules specify consistency and completeness requirements that must hold among the 
components of every single robot in the set, and relations over the entire set. For 
instance, for every single robot it is required that: 

• the events signaled by a robot are all generated by its control task, 

• every server declared in a robot is defined to be active in some state of the 
control task, 

• if an observer is active in a state then the events it might signal are expected by 
the control task at that state, 

• every event expected by the control task in a state is either signaled by an 
observer that is active in that state, or is declared as a host event of the robot. 

Similarly, there are rules that apply to the inter-robots relations. For instance, 

• there is a single root robot (not a refinement of any other robot in the set), 

• there are no circular refinements, 

• every server that is to-be-refined has a single refinement, 

• and every server is exclusively employed in a single state at a time. 
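
Some of the inter-robot rules lend themselves to a simple structural check, sketched here in Python. The table ACC_STRUCTURE maps each robot of the running example to the robots that refine its servers; treating the members of a composition as children of the composing robot is an assumption of this sketch, as is the function name check_structure.

```python
# robot -> robots that refine its to-be-refined servers (or that it composes).
ACC_STRUCTURE = {
    "ACC": ["Engine", "Switch", "CruiseControl"],
    "Engine": [],
    "Switch": [],
    "CruiseControl": ["SpeedControl"],
    "SpeedControl": ["ThrottleCrl", "GearCrl"],
    "ThrottleCrl": ["NormalCrl"],
    "GearCrl": [],
    "NormalCrl": [],
}


def check_structure(children):
    """Return a list of violations of the inter-robot rules."""
    errors = []
    referenced = {c for cs in children.values() for c in cs}

    # Rule: there is a single root robot.
    roots = [r for r in children if r not in referenced]
    if len(roots) != 1:
        errors.append("expected a single root, found %d" % len(roots))

    # Rule: every to-be-refined server has a refinement.
    for name in sorted(referenced):
        if name not in children:
            errors.append("no refinement for " + name)

    # Rule: there are no circular refinements (depth-first search).
    visiting, done = set(), set()

    def has_cycle(robot):
        if robot in done or robot not in children:
            return False
        if robot in visiting:
            return True
        visiting.add(robot)
        found = any(has_cycle(c) for c in children[robot])
        visiting.discard(robot)
        done.add(robot)
        return found

    if any(has_cycle(r) for r in children):
        errors.append("circular refinement")
    return errors
```

For the ACC example check_structure(ACC_STRUCTURE) returns an empty list. Because the table is keyed by robot name, a second refinement of the same server cannot be expressed at all, which is how this sketch sidesteps the uniqueness part of the third rule.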

3.2 Operational Semantics 

The operational semantics of a specification in Robots is given by a translation into a 
statechart that is interpreted according to the synchronous semantics given by Harel 
[7]. First we define behavioral models for control and server tasks, and provide their 
statechart representation. Then, we describe the composition of the statecharts 
obtained from every single task into a coherent system statechart. 

Basic Model of a Task Behavior. There are two basic approaches, asynchronous vs. 
synchronous, for modeling the behavior of tasks. In the asynchronous model,^ 
concurrent tasks execute independently, each progressing at its own pace (in 
particular, the program execution takes non-zero duration). Events are transferred 
either asynchronously (through mailboxes), or by handshaking (occurs only when a 
pair of tasks are willing to communicate at both sides of a channel). 

Formally, the behavior of asynchronous tasks is represented by the statechart of 
Fig. 6. Accordingly, all tasks are initially inactive, in which case they are not eligible 
for execution. A task becomes active by a create action,^ and returns to being inactive by 
a delete action. 




Fig. 6. An asynchronous task behavior 

A task in state Active alternates between the sub-states of Running and 
AwaitingEvent. It starts executing in state Running where the task program can 
perform normal data processing operations, create and delete actions, and the wait 
and signal actions. A wait action specifies an event that must be awaited before the 
program is allowed to continue the normal execution. Thus, its execution transfers the 
task into the state AwaitingEvent where it is suspended until the specified event 
occurs. A signal action has a system-wide effect as it triggers every task that is 
waiting for the specified event.^ A triggered task returns to its Running state, and 
continues the program execution. 

Alternatively, in a synchronous execution model [2] all tasks share a uniform view 
of the present events, and the triggered tasks execute their programs simultaneously 
and instantaneously (taking no time). Events signaled by computations become part of 
the input set, possibly triggering chains of reactions in concurrent tasks. An execution 
phase of a task terminates either by normal termination of the task program, or by 
executing a wait instruction such that the awaited event has not been signaled in the 
current execution phase. 

^ E.g., commercial real-time operating systems, and reactive languages like Ada and CSP [9]. 
^ Though every system necessarily employs a “main” task that is implicitly created at startup. 
^ We ignore queues where with each occurrence of an event only one selected task is triggered. 

In practice, synchronous systems consider the external events only at the ticks 
generated by a master clock at a fixed rate, and it is assumed that all computations 
triggered in a tick terminate before the next tick occurs (e.g., Statecharts [7], 
ESTEREL [2], LUSTRE [6]). 

The motivation for introducing the synchronous model has been the behavioral 
non-determinism of the asynchronous model. Specifically, in the asynchronous model 
there is no bound on the duration between the occurrence of an event and the response 
to it. Thus, it is impossible to carry out formal verification of real-time properties. 
The synchronous model indeed overcomes this problem; however, it is questionable to 
what extent the synchrony hypothesis (namely the possibility of instantaneous 
response) is reasonable. 

The Task Behavioral Model of Robots. The idea underlying the behavioral model 
of tasks in Robots is to make a complete separation between the reactive and functional 
aspects of real-time programs, and to interpret them in synchronous and asynchronous 
execution models, respectively. Thus, in addition to the natural simplicity gained by 
the separation of concerns, we achieve the capability of formal specification and 
verification of real-time properties while retaining the practical implementation of 
functional computations. 

In practice, the desired separation is obtained by employing two types of tasks: 
control and data-processing. Control tasks are synchronous programs whose 
operation is restricted to the form that can be represented by sequential statecharts (no 
AND states) such that the only action that may be executed in state-transitions is that 
of signaling an event. 

The computation of a data-processing task consists of only normal processing 
instructions (including asynchronous I/O operations). It is assumed that the 
computation always terminates and returns a value that is signaled to the environment 
(by default the returned value is done, indicating the completion of the computation). 
A data-processing task is executed asynchronously in response to specific activation 
events, and its execution is restricted by a hard deadline (the real-time requirement). 
Any deadline violation is considered an execution failure. The activation 
conditions of a data-processing task are not specified in the task computation, but 
handled by an associated control task with the following behavior. 

Fig. 7. The behavioral model of data-processing tasks 

Accordingly, the control task of a data-processing task consists of N ordered copies of 
the state Active, where N is the deadline duration expressed in ticks. A copy i enters 
the state Running whenever the activation event occurs and the copy i-1 is already in 
state Running. Thus, every occurrence of the activation event is responded to even if it 
occurs while the task is executing (note that the number of simultaneous activations 
of a data-processing task is bounded by its deadline). Furthermore, it has been shown 
[5] that this model gives rise to a decidable reasoning theory with respect to real-time 
requirements of asynchronous tasks. 
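
The bound on simultaneous activations can be illustrated with a small Python sketch that counts how many copies would be in state Running at once. It is a simplification, not the model of Fig. 7: it assumes at most one activation per tick and a fixed execution time for every activation, and the function name peak_running_copies is hypothetical.

```python
def peak_running_copies(deadline_ticks, activation_ticks, exec_ticks):
    """Return the largest number of copies that are Running at the same time.

    deadline_ticks: N, the deadline expressed in ticks.
    activation_ticks: ticks at which the activation event occurs.
    exec_ticks: execution time of each activation (assumed constant).
    """
    if exec_ticks > deadline_ticks:
        raise RuntimeError("deadline exceeded: the task reaches its Fail state")
    starts = sorted(set(activation_ticks))  # at most one activation per tick
    peak = 0
    for t in starts:
        # An activation started at tick s is Running during [s, s + exec_ticks).
        running = sum(1 for s in starts if s <= t < s + exec_ticks)
        peak = max(peak, running)
    return peak
```

Under these assumptions, even with an activation at every tick the peak equals the execution time in ticks, which cannot exceed N; this is why N copies of the state Active suffice.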

Statechart Representation. Given a specification in Robots that satisfies the 
semantic rules, we create the corresponding tree whose leaves denote executable 
servers (actually data-processing tasks), and the intermediate nodes are control tasks 
located in the tree according to the hierarchy dictated by the refinement and 
composition relations (see example in Fig. 4). With every node in the tree we 
associate a statechart as follows. Every server node (the leaves) is represented by the 
corresponding data-processing control statechart as defined in Fig. 7. Every control 
task is represented by the statechart (augmented with task-stickers) given in its 
specification. 

Next, we create a compound statechart that represents the specification. We start 
with the leaves, and then recursively replace every task-sticker in the host node with 
the statechart of the corresponding offspring which is composed in an AND manner 
with the state that is labeled by the task-sticker. Thus, a state X that is labeled by N 
task-stickers is replaced by an AND-state of N+1 components that include X and the 
N corresponding statecharts of the task-stickers. For example, the following statechart 
represents the translation of the top-level robot ACC (Section 2.3). 




Due to the hierarchical structure of the specification, we get at the end of the 
process a normal compound AND-statechart (with no task-stickers). The meaning of 
a Robots specification is given by the set of computations that satisfy the complete 
AND-statechart such that no server reaches its Fail state. 
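
The replacement of task-stickers by AND-components can be sketched as a short recursive function in Python. The nested-tuple representation and the names expand_task and expand_state are assumptions for illustration. Since the refinements of Engine, Switch, and CruiseControl are left out of the input here, these tasks appear as leaves, standing in for the statecharts that a full translation would substitute.

```python
# state -> task-stickers on its frame, for the top-level robot ACC.
ACC_STICKERS = {
    "ACC": ["Engine"],
    "Disabled": [],
    "Enabled": ["Switch"],
    "Inactive": [],
    "Active": ["CruiseControl"],
}


def expand_task(task, robots):
    """Return the statechart associated with a task.

    robots maps the name of each refined task to its state -> stickers table.
    A task without an entry is a leaf, represented by ("server", name).
    """
    if task not in robots:
        return ("server", task)
    states = {
        state: expand_state(state, stickers, robots)
        for state, stickers in robots[task].items()
    }
    return ("robot", task, states)


def expand_state(state, stickers, robots):
    """Replace a state with N stickers by an AND-state of N+1 components."""
    if not stickers:
        return state
    return ("AND", [state] + [expand_task(task, robots) for task in stickers])
```

For example, expand_task("ACC", {"ACC": ACC_STICKERS}) turns the state Enabled into ("AND", ["Enabled", ("server", "Switch")]), an AND-state of two components for one sticker. The recursion terminates because circular refinements are ruled out by the semantic rules.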

Execution environment. Our execution environment consists of a synchronous 
executive that runs the control tasks, and an asynchronous executive that 
concurrently runs the activated data-processing tasks. At each time instant 
t_i, the control executive examines the events that occurred during the period [t_{i-1}, t_i) in 
order to identify occurrences of activating events and deadline expirations (in 
particular, this set includes the events denoting terminations of task executions 
reported by the asynchronous data-processing-tasks executive, see below). 

The interaction between the executives consists of (possibly overlapping) 
interaction cycles, each representing a single execution of a function. An interaction 
cycle (Fig. 8) starts upon each detection of the activation event by the control 
executive. An activation request that consists of the task name and the activation time 
is issued to the data-processing-tasks-executive, which in response schedules the 
corresponding task and monitors its execution until completion. Every termination of 
a task execution is immediately reported to the control-executive including the 
activation time that was specified in the corresponding activation request. 

Fig. 8. Robots execution environment 

Starting from the next instant after issuing an activation request, the control-executive 
looks for a corresponding termination event (identified by the data-processing-tasks 
executive). If the deadline expires before the expected termination, the whole 
execution is declared to fail. 
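
The interaction cycle can be sketched as a tick loop in Python. The function control_executive and its table-based inputs are assumptions made for illustration; in particular, the terminations table stands in for the reports of the data-processing-tasks executive, and a termination is taken to be in time if it is seen no later than the deadline, counted in ticks from the activation.

```python
def control_executive(n_ticks, activations, terminations, deadline):
    """Simulate the control executive over n_ticks instants.

    activations: tick -> task names whose activation event is detected then.
    terminations: tick -> (task, activation_time) pairs reported at that tick.
    deadline: task -> deadline in ticks.
    Returns ("ok",) or ("fail", task, activation_time, tick).
    """
    pending = set()  # activation requests still awaiting a termination
    for t in range(n_ticks):
        # Terminations reported during [t-1, t) close their requests.
        for report in terminations.get(t, []):
            pending.discard(report)
        # Any request that is still open at its deadline fails the execution.
        for task, t0 in sorted(pending):
            if t - t0 >= deadline[task]:
                return ("fail", task, t0, t)
        # Newly detected activation events open requests, which are
        # examined starting from the next instant.
        for task in activations.get(t, []):
            pending.add((task, t))
    return ("ok",)
```

Because each request carries its activation time, overlapping executions of the same task are told apart: two activations of a task at ticks 0 and 1 are tracked as separate requests, each with its own deadline.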

4 Related Work 

Robots has been designed to represent a specific real-time architectural style (in 
contrast with general ADLs, e.g., Darwin [11], Wright [1], and Rapide [10]). Though the 
domain-specific approach means loss of generality, it has the advantages of simpler 
and more precise expression of the design, better design comprehension, and support of 
automatic analysis and verification. 

There are several domain-specific real-time ADLs. For instance, MetaH [3] has a 
rich set of component and connector types (modes, macros, processes, events, 
monitors, packages, etc.) that are abstracted in Robots by a few 
application-oriented constructs (control, observer, process, state, and event). The 
variety of architectural constructs yields complex inter-object and hierarchical 
relations. This fact is reflected in the language semantics, which involves several 
mathematical models, different from the synchronous/asynchronous model employed 
in Robots. However, the language enables formal analysis of various important 
real-time aspects, such as scheduling, reliability, and security, which we hope 
Robots will support in the future. 

Conceptually, the work most closely related to ours is the Process-control paradigm 
suggested by Shaw [12]. This paradigm adopts the closed control loop structure with 
clear separation of the control and process subsystems. However, in contrast with 
Robots, the process cannot be reconfigured in response to mode transitions taken in 
the control part; these must be reported to the process, which will react independently. 
Consequently, there is an increased amount of data flow, the process algorithm must 
constantly consider all possible inputs, and part of the control logic must be replicated 
in the process. Hence, the process becomes more complex than really needed. Also, 
the Process-control paradigm does not suggest any hierarchy. 

From a technical point of view, Robots is based on statecharts [8]. The extension 
of states with task-stickers is actually syntactic sugar that imposes a conceptual 
hierarchy on sets of tasks which otherwise would be represented by a flat structure of 
AND-composed statecharts. We believe the conceptual hierarchy improves to a large 
extent the expressiveness and comprehension of the system design. 

Most of the real-time architectural styles (see Shaw [13] for a good survey) employ 
state machines in order to represent the behavioral aspect of the design. However, 
none takes the behavioral design as the orientation for the structure of the system. 
Also, only a few provide formal relations between the functional and behavioral 
specifications. 

Finally, the operational model of Robots that integrates synchronous execution of 
control tasks with the deadline constrained asynchronous execution of data-processing 
tasks has been formally defined and analyzed by the author in previous works [4,5]. It 
is worthwhile mentioning that existing reactive ADLs that support formal reasoning 
are based on asynchronous execution models (e.g., Pi-Calculus in Darwin [11], CSP 
in Wright [1]). Thus, they are suitable for distributed systems but restricted with 
respect to real-time properties specification. 

5 Status and Future Work 

Robots evolved during the development of large-scale real-time systems in MBT.^ 
Statecharts is a common specification formalism in MBT; however, until recently it was 
used independently of the software design. At first, Robots was formulated as 
guidelines for the system architecture, to be used with statecharts and various 
operating systems. In quite a short time, it has been adopted by software engineers 
and assimilated in the standard development process. Its salient features emphasized 
by users are: the simplicity, the reflection of the natural design of real-time systems, 
and the ease of the design understanding by new engineers that join the team. 
Encouraged by the users’ response, the language has been formally defined and is 
currently under implementation. 

Future work on Robots is aimed at several directions: 

• Automatic verification, mainly by adoption of existing real-time model- 
checking tools. 

• Identification of design patterns (for instance, we were able to find out that 
Switch, Resume, and Pedals represent the typical behaviors of observers). We 
expect this activity will give rise to an object-oriented extension of Robots. 

• Extending the specification of activation conditions with the full power of 
MASS (an interval temporal logic based language that is associated with a proof 
system [5]). 

^ MBT is the systems division in the electronic group of Israel Aircraft Industries. 

Abstract. Over the past decade a variety of process languages have been 
defined and applied to software engineering environments. The idea of using 
a process language to encode a software process as a “process model”, and 
enacting this using a process-sensitive environment is now well established. 

Many prototype process-sensitive environments have been developed; but 
their use in earnest has been limited. We are designing a second generation 
process language which is a significant departure from current conventional 
thinking. Firstly a process is viewed as a set of mediated collaborations rather 
than a set of partially ordered activities. Secondly emphasis is given to how 
process models are developed, used, and enhanced over a potentially long 
lifetime. In particular the issue of composing both new and existing model 
fragments is central to our development approach. This paper outlines these 
features, and gives the motivations behind them. It also presents a view of 
process support for software engineering drawing on our decade of experi- 
ence in exploiting a “first generation” process language, and our experience 
in designing and exploiting programming languages. 

1 Introduction 

There is a long association between process languages and efforts to provide computer 
support for software engineering. Initially there was considerable work which concen- 
trated on providing integrated project support environments to support users working 
together. The emphasis was mostly on tool integration; ensuring that the design tools, 
compilers, debuggers etc. could work together. It became apparent that support envi- 
ronments could make a greater contribution if they had knowledge of the process in 
which their users were involved. This led to further research on process-based environ- 
ments [1,5,7]. The recognition that software processes can themselves be described as 
software is attributed to Osterweil [19], and has led to the development of process pro- 
gramming as part of software engineering, and ongoing research into process-centred 
environments. 

More recently the interest in understanding and designing business processes, in 
particular the vogue for business process re-engineering, has led to an expansion of in- 
terest in process languages outside the software area. In general these have been simple 
languages specialised for an application area. For example, many forms-based work- 
flow systems have a language based on the notion of passing an electronic document 
around an organization [14]. The concentration has been on high-volume processes 
where standards are key to operational efficiency, e.g. car loan credit checking. 

Over the past decade the Informatics Process Group (IPG) at Manchester has devel- 
oped a number of process models using the first-generation language of ICL’s Process- 
Wise Integrator (PWI) [20,24] and more recently ProcessWeb [9,29] which combines 
PWI with a Web interface. We believe that our experience has been typical. First-gen- 
eration languages, and the systems which evaluate them to provide process systems, are 
technically feasible and promising [7]. However, the costs of developing process models 
are too high. Too great a knowledge of the language implementation is needed to devel- 
op effective models, and the code, when completed, tends to give a somewhat obscure 
representation of the process flow. This is partly because many first-generation lan- 
guages have adopted specific implementations, particularly with respect to modelling 
collaboration, which have restricted their range of applicability. 

Research in the Persistent Programming Research Group (PPRG) at St Andrews 
has tackled the problems of constructing and maintaining large, long-lived application 
systems, including software development environments [16,18]. Many of these prob- 
lems are closely related to process language issues and therefore the techniques devel- 
oped can be refined and incorporated in a process language. The two techniques dis- 
cussed in this paper are Communicating Actions Control System (CACS) [22] and hy- 
per-programming [13,17]. CACS addresses the problem of providing flexible 
concurrency control mechanisms to support collaborative working. Hyper-program- 
ming provides a novel approach to developing long-lived systems through allowing 
new code to be not just text but both text and explicit links to existing code and data. 

Together the IPG and PPRG are now developing a second-generation language 
based on a synthesis of our joint experiences [26]. We see process models as fulfilling 
a key role in modern computer systems; that of relating the business processes and the 
IT systems which support them [27]. Understanding this relationship is growing in im- 




Collaboration and Composition: Issues for a Second Generation Process Language 

portance as systems are increasingly knitted together from existing components, and 
must have the flexibility to adapt in response to business changes. Automation involves 
introducing new IT systems into existing systems and changing the business processes 
to exploit them. This has led us to place emphasis on how our second generation lan- 
guage represents collaboration, and how models can be developed by composing com- 
ponents. 

In designing a process language we are addressing the problems which we have ex- 
perienced with first generation languages. We also have a view of the kind of process 
system which we want to express in our new language which derives from our work on 
process modelling methods. 

2 Motivation - A Process System 

A number of process systems which execute programs in process languages have been 
developed. These systems have been strongly influenced by factors such as tool invo- 
cation, visualisation, and meta-processes which do not figure so highly in traditional 
program language design. There are a number of contributing factors: 

• The contribution of people is part of the process. A process system can support 
the people involved by ensuring that the right information is in the right place, at 
the right time. However, people are biddable rather than controllable by the proc- 
ess system. For example, if input is requested there is no control over how long a 
user will take to respond. 

• Processes can last for a long time. Software projects may last for weeks, months 
or even years. To support such processes, process systems must themselves last 
just as long. 

• Process models are developed over a period of time. Early parts of a software de- 
velopment project often involve investigating alternatives and making decisions 
about the future course of the project. To support this a process model cannot be 
fully defined when its enactment starts. A process system must support the incre- 
mental development of models and provide facilities which bind new and existing 
model fragments. 

The general scheme adopted by most process languages is to model the process as a pro- 
duction system. The software product is the output of the software development proc- 
ess. This process is usually described in terms of a set of partially ordered tasks (activ- 
ities) with output to input connections between them. This leads to a factory view where 
there is a “master process program” which provides the instructions to keep everything 
running efficiently; the emphasis is on design for efficiency. 

Our preference, based on experience in modelling processes both inside and outside 
the software domain, is to view the process system as a service system. The purpose of 
the process system is to provide effective assistance to its users. Software development 
requires the collaboration of many people over a period of time. A process system can 
provide information to people, ensure that they do not mistakenly work on incorrect, or 
incomplete, data, and it can carry out some routine, mundane activities on behalf of its 
users. From a systems theory perspective, the process system is a serving system which 
supports its users, a served system [3]. To be viable a process system must continue to 
support the changing requirements of its users; design for evolution is a key theme [25]. 
This implies a need to separate existing models into fragments which can be composed 
in new ways to address future requirements. 

For a process system to support the collaboration requirements of its users, its proc- 
ess language must be able to express a rich range of collaborations. A process language 
needs to provide high-level concepts which offer flexibility in modelling and enable ef- 
fective and efficient support systems to be developed. 

3 Collaboration 

Our experience, from prototype systems, industrial case studies, and developing a proc- 
ess modelling method, is that a focus on collaboration is an effective way of modelling 
processes [8,12,27]. Unfortunately whilst many first-generation languages offer good 
abstractions for activity, they offer only low-level communication mechanisms, without 
any real abstractions, with which to realistically model coordination [10]. 

In PWI’s Process Management Language (PML) this collaboration is modelled in 
terms of explicit message passing using typed buffers called interactions [2,11]. Our 
second generation approach is to adopt a more abstract general view of collaboration as 
mediated access to shared data. The specific implementation of message passing in 
PWI’s PML interactions can then be defined using this more abstract view. However, 
many other forms of collaboration can also be defined. 

3.1 A Small Example 

Consider the case of a sub-process where there is one software engineer who writes or 
updates a module, and another who must check it. One implementation of this involves 
a “module writing” activity which results in a revised module delivered to a “module 
checking” activity. This latter activity either outputs a “checked module” for input to 
the next sub-process or delivers the module and comments back to “module writing”. 
This implementation means that the module checker cannot start until the module writer 
has finished writing, and once the writer has submitted the module for checking, further 
module writing must wait until the checker finishes. In many cases, however, this is not 
how work progresses in reality; the module writer may deliver a draft version for check- 
ing and continue writing; a checker’s comments may include updates to the module. 
What matters is that the two people involved have an agreed protocol for manipulating 
the module, and deciding when their tasks are complete. 

Figure 1 provides a sketch of this small example. The overlap between “module 
writing” and “module checking” shows that these components collaborate. A collabo- 
ration, which is in essence shared behaviour, is defined by identifying the shared data 
involved and the rules for accessing it. Our approach to defining these sharing rules is 
outlined in the next section. A key part of this is the separation of the collaboration pro- 
tocol from the details of a specific process. In this example there is a “writing-checking” 
collaboration protocol which is independent of the organizational rules on how modules 
should be written, and how they should be checked. We may want to reuse this protocol 
in the context of writing and checking design documents, test plans, user manuals etc. 

Figure 1 A small process showing the collaboration between two components 

Similarly, we may want to change the details of how a module is checked without mak- 
ing changes to the collaboration protocol. This separation is one example of the kind of 
structuring which needs to be applied to process models in order to maintain their clar- 
ity. 

We have written a series of related writer-checker processes in PWI’s PML [2,11], 
Java, and Little-JIL [21,28]. In these implementations, the collaboration protocol was a 
significant proportion of the effort, and it became closely interwoven with the rest of 
the process. This restricted our ability to reuse fragments between related processes; our 
library of re-usable process components just did not grow as we had envisaged. 

3.2 Communicating Actions Control System (CACS) 

Communicating Actions Control System (CACS) is an abstract operational model 
designed to allow the specification of coherency protocols for accessing shared data [22]. 
The CACS model consists of actions (computations) that access objects (shared data). 

A particular coherency protocol, for example atomic (ACID) transactions, is de- 
fined by a set of significant events and a set of rules. The significant events specify the 
operations on shared data that need to be coordinated by the protocol. The rules specify 
the details of this coordination. For example, consider the atomic transaction protocol: 

• The significant events are: begin, commit, abort, read and write. 

• The rules give an operational specification of how the ACID transaction proper- 
ties are to be enforced. 

As it runs, each action generates a sequence of significant events, which are handled by 
the CACS controller, according to the rules. Each rule specifies what the controller 
should do in response to a particular significant event. In addition to performing 
arbitrary computation, the controller may suspend and resume actions, and generate 
additional, synthetic events that are added to the incoming event stream. 

let writerule [write, %id, %v] = 
    if there is an unread write of shared data item %v 
    then suspend %id on write %v 
    else 
    begin 
        update %v 
        if there is a read %id2 suspended on write %v 
        do unsuspend %id2 
    end 

Figure 2 Example CACS Rule - write is the CACS event, %id the CACS action 
identifier and %v is the shared data item manipulated 

This architecture gives considerable flexibility both in defining new coherency pro- 
tocols, and in varying the policy implementing a particular protocol. For example, both 
optimistic and pessimistic flavours of atomic transactions could be defined in a similar 
way. The significant events would be the same in each case, but the bodies of the rules 
would vary. For an optimistic scheme the controller would allow actions to read and 
write shared data without restriction, recording which data objects had been accessed. 
On a commit event, the controller would test for conflict with other transactions, and 
generate an abort event for each conflicting transaction. For pessimistic transactions the 
controller would suspend an action on the first attempted conflicting access to shared 
data. 
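
The optimistic scheme described above can be illustrated with a minimal Python sketch. It is not the CACS implementation: the class name `OptimisticController`, the `on_*` method names, and the conflict test (a committing transaction conflicts with any other active transaction that read or wrote an object it wrote) are assumptions made for the example.

```python
class OptimisticController:
    """Hypothetical rule set for optimistic atomic transactions."""

    def __init__(self):
        self.reads = {}    # transaction id -> set of objects read
        self.writes = {}   # transaction id -> set of objects written
        self.aborted = []  # synthetic abort events generated by the controller

    def on_begin(self, tid):
        self.reads[tid], self.writes[tid] = set(), set()

    def on_read(self, tid, obj):
        self.reads[tid].add(obj)    # no restriction, only record the access

    def on_write(self, tid, obj):
        self.writes[tid].add(obj)   # no restriction, only record the access

    def on_commit(self, tid):
        """Test for conflicts and abort each conflicting transaction."""
        mine = self.writes.pop(tid)
        self.reads.pop(tid)
        for other in list(self.reads):
            if mine & (self.reads[other] | self.writes[other]):
                self.on_abort(other)

    def on_abort(self, tid):
        self.reads.pop(tid, None)
        self.writes.pop(tid, None)
        self.aborted.append(tid)
```

A pessimistic rule set would keep the same five events but change the bodies of `on_read` and `on_write` so that a conflicting access suspends the action instead of being recorded.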

In general it is not possible for the CACS system to deduce automatically the points 
in an action at which significant events are generated. The source program must thus be 
annotated to indicate these. In some special cases, however, this may be done automat- 
ically. For example with atomic transactions the system may deduce where read and 
write events occur, but not begin, commit or abort. 

CACS specifications, in terms of events and rules, can be written for a wide variety 
of coherency protocols. These include traditional schemes such as atomic transactions 
and monitors, and more complex application-specific schemes. The programming in- 
volved in the correct implementation of these schemes can be defined and placed in a 
library for reuse. The writer of a CACS action, a process computation in our case, thus 
does not have to write CACS specifications in cases where standard coherency proto- 
cols are sufficient, but has the flexibility to define new schemes if required. For exam- 
ple, it is possible to implement CACS rules to give the particular message-passing se- 
mantics which are currently offered by interactions in PWI’s PML. 

In the writer-checker example, one useful coherency protocol is that offered by a 
single element buffer. A data item must be written before it is read. Once written, the 
“buffer” is full and the item must be read before it can be written again. In CACS terms, 
a read event may suspend because it must wait for the corresponding write event, and a 
write event may suspend until the value from the previous write has been read. There 
would be two CACS rules, one handling read events and one handling write events. 
Figure 2 illustrates pseudo-code for the write rule. 
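
As an illustration, both rules of the single element buffer protocol might be sketched in Python as follows. This is a sketch under stated assumptions rather than the CACS notation: the class `SingleSlotController`, the return values "ok" and "suspended", and the `delivered` table used to hand a value to a resumed reader are all hypothetical.

```python
class SingleSlotController:
    """Hypothetical controller for one shared item under the single element
    buffer protocol: each write must be read before the next write."""

    def __init__(self):
        self.value = None
        self.full = False          # True while there is an unread write
        self.waiting_readers = []  # action ids suspended on a read
        self.waiting_writers = []  # (action id, value) pairs suspended on a write
        self.delivered = {}        # action id -> last value handed to that reader

    def on_write(self, aid, v):
        """Write rule: suspend if the previous write is unread, else update."""
        if self.full:
            self.waiting_writers.append((aid, v))
            return "suspended"
        self.value, self.full = v, True
        if self.waiting_readers:   # unsuspend a read waiting on this write
            self._deliver(self.waiting_readers.pop(0))
        return "ok"

    def on_read(self, aid):
        """Read rule: suspend if nothing has been written, else consume."""
        if not self.full:
            self.waiting_readers.append(aid)
            return "suspended"
        self._deliver(aid)
        return "ok"

    def _deliver(self, aid):
        self.delivered[aid] = self.value
        self.full = False
        if self.waiting_writers:   # a suspended write can now proceed
            waiting_aid, v = self.waiting_writers.pop(0)
            self.on_write(waiting_aid, v)
```

In this sketch a suspended action is simply remembered in a list; a real controller would also stop and later resume the computation itself.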
Checker 

shared data - checksd, cuisd 
let chModule := nullModule 
let chComments := nullComments 
while checkernotfinished() do 
begin 
    chModule := checksd.module 
    ! CACS event - [read, my_aid, checksd.module] 
    cuisd.module := chModule 
    ! CACS event - [write, my_aid, cuisd.module] 
    chComments := cuisd.comnts 
    ! CACS event - [read, my_aid, cuisd.comnts] 
    checksd.comnts := chComments 
    ! CACS event - [write, my_aid, checksd.comnts] 
end 

Figure 3 Example code for Checker (including annotations identifying CACS events) 

One of the benefits of the CACS approach is that it does not restrict the number of 
computations which access a specific shared data area. This means that collaborations 
involving more than two parties can be modelled more naturally than with a message 
passing style. This was a particularly unpleasant problem to implement using the 
message-passing semantics of PWI’s PML interactions. 

An example checker is given in Figure 3, corresponding to the checker action 
shown in Figure 4. As the shared data is explicitly identified, we can anticipate that the 
compiler will be able to generate the CACS events, here shown in comments. The 
shared data, checksd, is shared between checker and writer. It contains two fields, mod- 
ule and comnts. This checker simply loops through four steps. The first step is to read 
the module from the shared data, checksd, and assign it to checker’s local variable, 
chModule. It is at this point that checker will suspend if the module is not available. The 
second step makes the module available to the checker user, and the third step reads the 
comments from the user when they are available. The fourth step is to write those 
comments to the shared data, checksd, and so make them available to the writer. The 
function checkernotfinished returns a boolean value which remains true until the process 
is complete. If there is only one checker, completion is simply when the comments indicate 
the module is acceptable, but with multiple checkers this can be more complicated. 
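
The four steps of the checker loop can be tried out with a small Python sketch. It stands in for the process language and CACS with standard library pieces: each shared field is modelled as a `queue.Queue(maxsize=1)`, which blocks a reader while empty and a writer while full, roughly the single element buffer protocol. The helper `make_shared` and the `accepted` predicate are hypothetical names introduced for the example.

```python
import queue
import threading


def make_shared(*fields):
    """Model each shared field as a single-slot buffer."""
    return {name: queue.Queue(maxsize=1) for name in fields}


def checker(checksd, cuisd, accepted):
    """Python rendering of the four steps of the Checker loop."""
    ch_comments = None
    while not accepted(ch_comments):         # role of checkernotfinished()
        ch_module = checksd["module"].get()  # read: blocks until the writer delivers
        cuisd["module"].put(ch_module)       # write: show the module to the user
        ch_comments = cuisd["comnts"].get()  # read: wait for the user's comments
        checksd["comnts"].put(ch_comments)   # write: pass comments to the writer


# Usage: the main thread plays both the writer and the checking user.
checksd = make_shared("module", "comnts")
cuisd = make_shared("module", "comnts")
worker = threading.Thread(
    target=checker,
    args=(checksd, cuisd, lambda c: c == "accept"),
    daemon=True,
)
worker.start()
checksd["module"].put("module v1")
print(cuisd["module"].get(timeout=5))    # the user sees the module
cuisd["comnts"].put("accept")
print(checksd["comnts"].get(timeout=5))  # the writer receives the verdict
worker.join(timeout=5)
```

Unlike CACS, the blocking behaviour here is fixed by the queue type; changing the protocol would mean changing the checker's data structures rather than swapping a rule set.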

3.3 User and Tool Interaction as Collaboration 

Figure 3 is code which will be executed by the process system to support the user re- 
sponsible for the checking. It makes the module available to this user by writing to a 
shared data area. Figure 4 depicts an overview of the module example showing the two 
users. Over time the boundary between what is done by the system and what is done by 
the users may change. For example, if checking only involved ensuring that the module 

Figure 4 User Interaction as Collaboration (diagram labels: User A (Writer), User B (Checker)) 

compiled and conformed to coding standards, then the organization might decide to in- 
vest in tools to automate this. In our process language we want to be able to model such 
automation. This gives us a requirement that collaboration with the external world, both 
users and external tools, should be handled in the same way as collaboration within the 
process system. This is represented as the manipulation of data shared between a com- 
putation within the process system and an “external computation” carried out by a user 
or another software tool. 

A process model in our system is represented by a persistent, strongly typed collec- 
tion of data and programs held within the system, interacting with other software and 
users outside the system. The hyper-program, to be described in the next section, is a 
representation which provides a single consistent description format for everything 
within the system. Input/output then involves a translation between the strongly typed 
internal form and other external forms. For interaction with a user these external forms 
might be X-Windows events or HTTP streams; for interaction with other tools they 
might be raw bytes, database relations etc. 

A particular sequence of Input/Output can itself be viewed as a collaboration be- 
tween the outside world and an action within the process model that implements the re- 
quired translation. CACS rules can thus be used to coordinate the Input/Output. Again, 
a library of pre-defined rule sets for standard Input/Output patterns can be provided. 

Figure 5 Automation through moving the process system boundary (diagram label: User A (Writer)) 

4 Composition 

A significant motive for our focus on collaboration was the kind of process systems 
which our experience tells us it is desirable to build. Similarly, the way we build such 
systems is a significant motive for our composition focus. In this context there is a need 
to address not only the issue of how components are composed, but also the facilities to 
decompose assemblies of components and re-compose them in new ways. 

Our view is that a model developer should be thinking about composing fragments, 
rather than writing a model from scratch. This gives a close mapping between the way 
that a model is understood as a set of mediated collaborations, and the way that it is de- 
veloped. One motivation for this is to encourage re-use of existing model fragments as 
a standard part of model development. A second motivation stems from the potential 
longevity of our process models. 

Figure 5 depicts the automation of the checking in our example through replacing 
“User B” with a computation within our process system. If the “automated checking 
tool” uses the same shared data and protocol as “User B” there is no need for any chang- 
es to “checker”. If we had replaced “User B” with an external tool, the diagram would 
have been the same as Figure 4 with “User B” re-labelled “external checking tool”. 
Again our focus would have been on the collaboration between this tool and “checker”. 

The examples in Figure 4 and Figure 5 show a close relationship between modelling 
based around collaboration and a modelling method in which composition plays a 
strong role. The intuition from the figures is that in both cases “writer” and “checker” 
will be composed in the same way. A typical concrete approach to sharing, such as 
interactions, also defines, by implication, a composition approach. Our abstract sharing 
model does not. We need to have composition facilities to enable us to construct spe- 
cific sharing implementations in the language. 

In the context of our small example, there are many possible enhancements to a 
model which supports a single writer and single checker. In some situations it may be 
better to have two checkers who must both agree that the module is acceptable. This 
could be made visible to the writer, or it could be hidden so that the writer collaborates 
with one, two, or more checkers in exactly the same fashion. With more than two check- 
ers there might be some kind of voting algorithm to determine if the module was accept- 
able. Writing and checking might form part of a larger quality control process, such as 
Fagan inspections. Our goal in using CACS is to allow the programmer to carry out 
these kinds of adaptations as easily as possible. 
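
As a small illustration of such an adaptation, the acceptance decision for one, two, or more checkers could be isolated in a single function, so that the writer's side of the collaboration does not change. The function name `module_accepted`, the verdict encoding, and the `needed` parameter are assumptions made for this sketch.

```python
def module_accepted(verdicts, needed=None):
    """Decide whether a module is acceptable.

    verdicts maps a checker id to True (accept), False (reject) or
    None (no verdict yet). By default every checker must accept;
    pass needed=k for a simple voting scheme requiring k acceptances.
    """
    accepts = sum(1 for verdict in verdicts.values() if verdict is True)
    if needed is None:
        needed = len(verdicts)
    return accepts >= needed
```

With a single checker this reduces to that checker's verdict; with `needed=2` and three checkers it behaves as a majority vote.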

4.1 Hyper-Programming for Model Development 

The technique which we adopt is to base our model development on hyper-program- 
ming [13,17]. Traditionally, a program which accesses another potentially shared object 
during its execution, contains a textual description of how to locate the object. This de- 
scription is subsequently resolved, commonly during linking for code objects and dur- 
ing execution for data objects. This resolution is necessary because programs are con- 
structed and stored in some long-term storage facility, such as a file system, which is 
separate from the run-time environment which disappears at the end of each program’s 
execution. By contrast in a persistent programming system, programs may be construct- 
ed and stored in the same environment as that in which they will subsequently be exe- 
cuted. This means that objects accessed by a program may already exist when the pro- 
gram is composed. Direct links to the objects can be included in the program rather than 
textual descriptions of the access paths by which they can be located at execution time. 
A program containing both text and links to objects is a hyper-program. 
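
The contrast between a textual access path and a direct link can be approximated in Python, although Python is not a persistent programming system and this is only an analogy. In the sketch below, `registry` stands in for a file system or name service, and `textual_program` and `hyper_program` are hypothetical names.

```python
registry = {}  # stand-in for long-term storage: access path -> object


def textual_program(path):
    """Conventional style: holds a textual description of where the object is.
    The description is resolved only when the program runs."""
    def run(module):
        rule = registry[path]   # a wrong or stale path fails here, at run time
        return rule(module)
    return run


def hyper_program(linked_object):
    """Hyper-program style: holds a direct link to an object that already
    exists when the program is composed."""
    if not callable(linked_object):   # early check, at construction time
        raise TypeError("linked object is not callable")
    def run(module):
        return linked_object(module)
    return run
```

If the registry entry is later removed or renamed, a program built with `textual_program` fails when executed, whereas one built with `hyper_program` still holds its link; an unsuitable link is rejected when the program is constructed.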

There are a number of benefits of hyper-programming [13]. 

• early checking - access path checking and type checking for linked components 
can be performed during program construction, 

• associations between executable programs and their source programs can be 
maintained automatically, 

• source representations are available for all programs, including those that, due to 
the context in which they were defined, contain references to other existing data; such 
programs are difficult to describe fully with a purely textual notation. 

These benefits are all relevant in our context of a process language and system which 
supports model development through composition and incremental evolution. The abil- 
ity to include explicit links to existing objects closely matches development through 
composition. Part of a modeller’s task is to describe how existing fragments (objects) 
should be assembled to produce the new model required. The ability to relate executable 
and source descriptions means that process model instances and their source 
descriptions can be related over their lifetime rather than just at instantiation time. The ability 
to have hyper-program representations for both source and run-time makes incremental 
definition of long-lived models significantly simpler. 

4.2 Returning to our Small Example 

Hyper-programming is very helpful in enabling a development by composition ap- 
proach. We can develop the components separately and then use hyper-links to make 
the appropriate connections. We can also unlink and re-link as required. If we consider 
the writer-checker example there are several possibilities for components being devel- 
oped at different times. CACS specifications for common coherency protocols are 
available in a library when the writer-checker model is being developed. Hyper-pro- 
gramming enables the use of these in writer-checker to be checked at compile time. 
Simple mistakes can be eliminated when the model is being developed rather than only 
becoming evident at run-time. Another process might want to use the checked module 
once writer-checker is complete. A compositional development approach, supported by 
hyper-program links to existing fragments (objects), is well suited to developing mod- 
els and libraries over a period of time. 

In most first generation process languages the traditional development sequence is: 
write a process definition; specialise the definition into an enactable process; and spark 
the enactable process to yield an enacting process [6]. This means that a single process 
definition can be specialised and sparked many times to give separate enacting process- 
es. The distinction between the definition (source) and enacting (run-time) representa- 
tions makes it more difficult to compose new definitions with fragments of existing en- 
acting processes. In progressing from the definitions of the separate components in 
writer-checker to an enacting model, which is supporting two users writing and check- 
ing a particular module, we make use of the ability of hyper-programs to provide both 
source and run-time representations. When writer and checker are developed they are 
defined as scripts. A script corresponds to a CACS action. It records business rules in 
terms of the sequencing of operations which manipulate local script data and mediated 
shared data. (For example, Checker in Figure 3.) A script is a piece of code and a 
suspended thread. When defined, it is suspended at the start of its execution. The hyper-program 
which is compiled to produce a script is also the hyper-program which represents the 
script in its initial suspended state. The system has a built-in function activate which is 
used to spark a collection of scripts, returning an activity. (An activity thus corresponds 
to a set of CACS actions, along with associated shared data and rules.) 

Figure 6 illustrates activate sparking the scripts in the writer-checker example. The 
activate takes two parameters: a set of scripts and a set of rules. Thus the activate would 
be passed the Checker script code from Figure 3 and the writerule from Figure 2. The 
scripts mention the CACS events, while the rules provide the implementation of the 
events in a particular concurrency protocol. For example, in a case where the scripts 
used a transaction protocol there might be one set of rules which implemented optimis- 
tic locking and another set which implemented pessimistic locking. If we want to have 
multiple instances of writer-checker then we can write a constructor function which re- 
turns an activity, and call it as often as required. 
Figure 6 Using activate to convert scripts into an enacting process 

! suspend activity and extract a script 
suspendedwc := decompose(wcactivity); 
suspendedChecker := suspendedwc.scriptvector[checkerindex]; 

! revise script and replace in suspended activity 
revisedChecker := changefunction(suspendedChecker); 
suspendedwc.scriptvector[checkerindex] := revisedChecker; 

! restart the suspended activity 
wcactivity := activate(suspendedwc); 

Figure 7 An example of stopping, modifying and restarting an activity using decompose 

There is also a decompose function which takes an activity and returns the scripts 
in a suspended state. An activate following a decompose restarts the scripts from their 
suspension point as if the decompose had never occurred. It is also possible to use de- 
compose and activate to dynamically compose or change enacting models. It may be 
that after decomposition some new scripts are added to the collection before it is acti- 
vated, or individual scripts might be replaced, as shown in Figure 7. 
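
The decompose/activate cycle of Figure 7 can be pictured with a small Python sketch. It is an illustration only, not the system's actual implementation: the classes Script and Activity, the step counter standing in for a suspension point, and the dictionary keyed by script name are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Script:
    """Hypothetical stand-in for a suspended script (thread plus its code)."""
    name: str
    code: str
    step: int = 0  # assumed representation of the suspension point


@dataclass
class Activity:
    """Hypothetical stand-in for an enacting collection of scripts and rules."""
    scripts: dict
    rules: list = field(default_factory=list)
    running: bool = True


def activate(scripts, rules):
    """Spark a set of scripts under a set of rules, returning an activity."""
    return Activity(scripts=dict(scripts), rules=list(rules))


def decompose(activity):
    """Suspend an activity and hand back its scripts and rules."""
    activity.running = False
    return dict(activity.scripts), list(activity.rules)


# Start the writer-checker activity, then stop it and swap in a revised checker.
wc = activate({"writer": Script("writer", "v1", step=3),
               "checker": Script("checker", "v1", step=5)},
              rules=["writerule"])
scripts, rules = decompose(wc)
old = scripts["checker"]
scripts["checker"] = Script("checker", "v2", step=old.step)  # keep suspension point
wc2 = activate(scripts, rules)
```

In this sketch the untouched writer script resumes from the step at which it was suspended, which mirrors the statement that an activate following a decompose restarts scripts as if the decompose had never occurred.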

5 Related Work 

There are different schools of thought about the appropriate current research goals in 
the area of process languages. There are those who see new languages as an irrelevance 
which create artificial barriers between the research community and industrial practi- 
tioners. They place emphasis on exploiting and inter-operating with existing tools such 
as configuration management systems, workflow systems, object request brokers etc. 
[4]. This work is important but this does not mean that there is no need for further re- 
search on process languages. It has been noted that despite the considerable research in 
process languages, almost no novel approaches to software development have emerged. 
This suggests that the first-generation languages have been overly constrained by the 
emphasis on describing, promoting and supporting existing software processes. It also 
supports our thesis that many of these languages made an early commitment to partic- 
ular mechanisms, which both make them difficult for inexpert modellers to use, and 
make it clumsy, or impossible, to represent some forms of collaboration. 

The “second generation” school advocate that the lessons learned from first gener- 
ation languages now need to be consolidated and exploited through the development of 
new process languages. Here the chief difference seems to be between those who be- 
lieve the most promising approach is a set of sub-languages which can be factored to- 
gether as and when required [23], and those like ourselves who are concentrating on a 
better core language [15]. One common theme is the issue of managing concurrency. 
Little-JIL [21] is a sub-language of the second generation process language JIL [23]
which concentrates on the coordination of activities and agents. It has a visual syntax 
and is aimed at making it easier for practitioners to experiment with process programs. 
This emphasis on coordination is closely related to our view of processes as sets of me- 
diated collaborations. In both it is recognised that an important part of managing con- 
currency is expressing how shared resources are handled. Another common theme is the 
importance of well-defined semantics to enable reasoning about processes written in 
second generation languages. This is one area in which our process modelling method 
[27] needs further improvement. 

6 Conclusions 

Interest in understanding processes and how we can apply computer systems to support 
them continues to drive the development and use of process languages. In general mod- 
elling software engineering processes has turned out to be particularly challenging and 
has therefore often acted as the driver in process language design. However, process 
languages should not be evaluated just on how accurately they can reflect established 
best practice in software engineering, producing traditional applications in traditional 
ways. There is a need to support rich collaborative protocols between a process system 
and users, and between a process system and other software tools. Viewing a process as 
a set of mediated collaborations is clearly a very general approach, within which partic- 
ular collaboration styles can be defined. A common approach to collaboration whether 
within the process system or across its boundaries supports a clean and general ap- 
proach to modelling process automation. 

As processes may be developed incrementally over a period of time, the models to 
support them must be incrementally developed too. The process language and system 
must provide facilities which combine new process model fragments and existing en- 
acting models. A collaborative viewpoint matches well with an approach based on ge- 
neric composition facilities. These can be used to recombine existing process model 
fragments, and further to retain the composition structure to assist future understanding 
and change. 

In developing a second generation process language we have been motivated by the 
problems in using existing languages, and a recognition of the key role of process lan- 
guages in modern flexible architectures. We need a better understanding of process lan- 
guages in order to meet the next challenge of developing appropriate architectures for 
the support and integration of process systems. By exploiting CACS and hyper-pro- 
gramming we are able to solve the issues of collaboration and composition in the style 
shown. This gives us an abstract machine which addresses collaboration and composi- 
tion at a basic level and can be used as the target for other higher-level process
representations.
Abstract. We examine the benefits of using an object-oriented modeling
language for software process modeling. We show how the Unified
Modeling Language (UML) can be used to model software processes 
based on dynamic task nets, which evolve continuously during enact- 
ment. We have selected UML for various reasons: it is wide-spread, pro- 
vides a comprehensive set of diagrams for both structural and behavioral 
modeling, and supports the early phases of process modeling (analysis 
and design). 

Like many other object-oriented modeling languages, UML has no well- 
defined semantics. We indicate how a process model described in UML 
can be automatically transformed into an executable form, i.e., we pro- 
vide dynamic semantics for UML models. To this end, UML models are 
transformed into programmed graph rewriting systems which are used 
to drive a process management environment. 

Keywords: Software Process Models, Software Engineering Tools and 
Environments 



1 Introduction 

Software processes are highly dynamic. Many changes have to be taken into 
account while a software process is being executed: changing requirements, feed- 
back to earlier stages of the software life cycle, moved deadlines, shrinking bud- 
get, etc. These changes challenge the capabilities of process-centered environ- 
ments [7]. 

For modeling software processes, we have proposed dynamic task nets [9], i.e., 
hierarchies of tasks that are in addition connected by various kinds of horizontal 
relationships (control flow, data flow, and feedback). The most essential feature 
of these task nets is that they continuously evolve during the enactment of a 
software process. This contrasts sharply with the distinction between build time
and run time, as it is made in most workflow management systems [16]. 

Originally, dynamic task nets were defined without any reference to an object- 
oriented modeling approach. However, the continuous evolution of task nets 
makes an object-oriented approach particularly attractive. Therefore, we have 
decided to examine the benefits of using an object-oriented modeling language 
for software process modeling. For this purpose, the Unified Modeling Language 
[3] appeared to be a natural choice. Some of the benefits we expected include: 
— If a wide-spread notation is used, process models can be communicated more
easily to a larger number of people. 

— UML provides a large set of diagrams (class diagrams, object diagrams, 
collaboration diagrams, state diagrams, etc.) which can be used to define 
both structure and behavior of dynamic software processes. 

— Object-oriented modeling supports the earlier phases of process engineering 
(analysis and design), while most process modeling approaches, in particular 
those underlying process-centered environments, primarily focus on process 
programming. 

On the other hand, we also expected some problems, in particular because 
UML is an informal modeling language which does not have a well-defined (dy- 
namic) semantics. 

In this paper, we describe how we are using UML for modeling software pro- 
cesses based on dynamic task nets. Section 2 summarizes the main features of 
dynamic task nets. Section 3 introduces the main components of the DYNA- 
MITE process management environment. Section 4 constitutes the main part of 
this paper, which is devoted to process modeling in UML. In Section 5, we de- 
fine the semantics of UML process models by a mapping into a graph rewriting 
system. Section 6 summarizes the lessons learned. Related work is compared in 
Section 7. Finally, Section 8 presents a short conclusion. 

2 Dynamic Task Nets 

The DYNAMITE model [9] introduces DYNAMIC Task nEts for software process 
management. A task represents a unit of work that is typically performed by a 
human developer (with tool support). A task may have input parameters and 
output parameters, which are the documents the task is working on. A task 
reads its input documents upon activation, works with these documents and 
finally produces some output documents. Tasks are connected by control flow 
relationships which describe the order of execution of tasks. In addition, there 
are data flow relationships which refine control flow relationships. While control
flows state only the existence of temporal dependencies, data flows describe the
passing of documents between tasks. 
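
As a rough illustration of these definitions, the following Python sketch represents tasks with input and output documents and lets a data flow be added only where a control flow already exists. The class names, the tuple encoding of relationships, and the document names are assumptions made for the example and are not taken from the DYNAMITE implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    inputs: list = field(default_factory=list)   # names of input documents
    outputs: list = field(default_factory=list)  # names of output documents


@dataclass
class TaskNet:
    tasks: dict = field(default_factory=dict)
    control_flows: set = field(default_factory=set)  # (source, target) task names
    data_flows: set = field(default_factory=set)     # (source, out_doc, target, in_doc)

    def add_task(self, task):
        self.tasks[task.name] = task

    def add_control_flow(self, src, trg):
        self.control_flows.add((src, trg))

    def add_data_flow(self, src, out_doc, trg, in_doc):
        # A data flow refines a control flow, so one must already exist.
        if (src, trg) not in self.control_flows:
            raise ValueError(f"no control flow from {src} to {trg}")
        if out_doc not in self.tasks[src].outputs:
            raise ValueError(f"{src} has no output {out_doc}")
        if in_doc not in self.tasks[trg].inputs:
            raise ValueError(f"{trg} has no input {in_doc}")
        self.data_flows.add((src, out_doc, trg, in_doc))


net = TaskNet()
net.add_task(Task("Analyse Request", inputs=["request"], outputs=["spec"]))
net.add_task(Task("Redesign Application", inputs=["spec"], outputs=["design"]))
net.add_control_flow("Analyse Request", "Redesign Application")
net.add_data_flow("Analyse Request", "spec", "Redesign Application", "spec")
```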

Figure 1 shows a dynamic task net which models the process of extending a 
software system during maintenance. At the beginning only little is known about 
the process. The request for extending the software system has to be analyzed 
and the application has to be redesigned. Finally, the modified system has to be 
installed. The intermediate structure remains unspecified, because it depends on 
the changes to the design document’s internal structure (part i). According to 
the new design document produced by the task Redesign Application, modules B 
and D have to be changed and a new module C has to be introduced. Thus, the 
project manager introduces two tasks for changing modules B and D together 
with a task for implementing module C. In addition, he adds subsequent test 
tasks. The control flow relationships between the test tasks reflect the module 
hierarchy, i.e., the topmost module D is the last one to be tested (part ii). 
Fig. 1. Examples of dynamic task nets



The execution of tasks is modeled by their states. E.g., a task which is
currently being executed is in state Active. Note that a task may start execution
before termination of its predecessors (simultaneous engineering). Tasks which 
wait for their execution are in state Waiting. After a task has produced all of its 
output parameters, its execution is finished and its new state is Done. 

Faulty output documents produced by predecessor tasks may cause errors 
in later phases of the development process. In this case, it is necessary to go 
back to the predecessor task and produce a corrected version of the document. 
This situation is modeled by a feedback flow which is introduced between the 
task where the error was detected and the task which is responsible for it. If the 
predecessor task has already terminated, it is not reactivated. Rather a new task
version is derived from the old one and the old work context is reestablished. 
Using derivation rather than reactivation of tasks results in creation of a process 
execution’s trace, which can later be examined by the project manager to find 
indications for process improvement. For example, in part iii of Figure 1
Installation raises feedback to Change Module B, resulting in new task versions along
the path from the feedback’s source to the feedback’s target. 
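
The difference between reactivating a task and deriving a new version can be sketched in Python as follows. This is a simplified illustration under our own assumptions: the class TaskVersion, the function names, and the linked-list representation of the trace are hypothetical, and the re-establishment of the old work context is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskVersion:
    name: str
    version: int
    state: str                                   # "Waiting", "Active" or "Done"
    predecessor: Optional["TaskVersion"] = None  # version this one was derived from


def handle_feedback(target):
    """Return the task version that should process the feedback."""
    if target.state != "Done":
        return target  # not yet terminated: no new version is needed
    # Terminated: derive a successor version instead of reactivating the old one.
    return TaskVersion(target.name, target.version + 1, "Active", predecessor=target)


def trace(task):
    """List all versions of a task, oldest first, as (version, state) pairs."""
    versions = []
    while task is not None:
        versions.append((task.version, task.state))
        task = task.predecessor
    return list(reversed(versions))


change_b = TaskVersion("Change Module B", 1, "Done")
change_b_v2 = handle_feedback(change_b)
```

Because the old version stays in state Done and remains reachable from the new one, the history of the process execution is preserved for later inspection.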



3 Environment Overview 

The DYNAMITE environment supports human-centered development processes 
through various tools (Figure 2). Humans interact with these tools in three 
different roles: as project manager, technical developer or as process modeler. 

Fig. 2. Overview of the DYNAMITE environment (diagram not reproduced; surviving labels: PROGRES System, modeling tool)



The project management tool provides unrestricted access to a task net. It 
supports a project manager in constructing and analyzing task nets. During 
enactment he is provided with mechanisms and views to guide and monitor 
process performance. Internally, task nets are stored in a graph-based database 
system. 

The technical developers enacting the development process are supported 
by agenda and workspace tools that allow restricted, personalized access to a 
task net. An agenda tool deals with the tasks that have been assigned to a 
developer. It provides information on deadlines, status of tasks, guidelines etc. 
and is a developer’s central access structure to the management environment. 
When working on a specific task, a workspace is automatically provided. It 
gives access to versioned documents relevant for performing the task. Domain- 
specific technical development tools are integrated into the environment and can 
be activated on selected documents provided by the workspace tool. 

For the internal operational definition of dynamic task nets we utilize the 
graph transformation system PROGRES (PROgrammed Graph REwriting Sys- 
tems [24]). Graph transformations are a suitable approach because task nets 
form complex graph data structures and operations to manipulate and enact 
task nets can be specified using a uniform mechanism. This enables intertwined 
editing and enacting of task nets as was presented in Section 2. 

In a PROGRES specification the metaschema of dynamic task nets is defined 
as a graph type consisting of node types, edge types, and attributes. A set of 
generic operations to manipulate and enact task nets is provided as complex 
graph transformations and procedures of these. This generic part of the
specification deals with untyped task nets, where all tasks show a predefined standard
behavior. This base specification can be used for ad hoc workflows in the case of 
chaotic processes. Moreover, executing ad hoc workflows may provide us with the 
process knowledge to be gathered for process analysis so that we may ultimately 
develop a more structured process model. 

In order to feed domain-specific process knowledge into the environment the 
specification can be enhanced in two ways. On the one hand, a specific process 
schema as an extension to the existing metaschema may be defined to provide 
structural constraints for the evolving task nets. On the other hand, the generic 
operations may be refined in order to introduce specific process execution policies 
like simultaneous or sequential engineering. 

While PROGRES is a very suitable environment to declaratively specify the
generic model, it can hardly be offered to a process modeler as a process modeling
tool. A process modeler is usually no expert on graph transformations. For this
reason we decided to enrich our DYNAMITE environment with a modeling tool 
which offers a process modeler comprehensive languages for process modeling. 
To this end we use the Unified Modeling Language (UML) (cf. Section 4). The 
domain-specific part of the PROGRES specification is then generated from the 
UML model (cf. Section 5). 

Using PROGRES’ C-code generation facilities, the management tool’s func- 
tional core is generated from the resulting PROGRES specification. To build a 
proper tool with a graphical user interface we make use of a framework for graph 
based applications that ships with the PROGRES environment [12]. 

Introducing a modeling tool serves multiple purposes: The modeling task is 
simplified significantly. Process models can be expressed quite naturally and can 
thus be better communicated to others and reasoned about. Building a process 
modeling tool allows us to offer online semantic analyses to guide the process 
modeler and enables us to support reuse of model fragments. 

4 Process Modeling in UML 

To increase model understandability and to decrease model maintenance effort, 
process modeling should be supported on a very high level of abstraction. In 
contrast to other approaches introducing a special purpose process modeling 
language (EPOS [10], JIL [25], MADAM [21]) this section will demonstrate how 
process models can be adequately defined in a standardized object-oriented mod- 
eling language, the UML. 

4.1 Structural Modeling 

The process specification models processes on the type level. Resulting process 
schemas thus abstract from a multitude of instance level task nets (examples 
of which were presented in Section 2). A process schema’s structure consists 
of task, parameter and realization classes and their various interdependencies. 
Consequently, we use UML’s class diagrams to model this structure. 
Fig. 3. Task interface and realization



Conceptually, we distinguish a task’s interface from its realization. The in- 
terface defines a task’s contract such as its parameter profile and its external 
behavior. It abstracts from a set of possible realizations, one of which can be 
selected by the corresponding actor at process enactment time. 

Figure 3 shows the interface of the task class Handle Extension Request at the 
top. We make explicit use of UML’s stereotype concept and introduce stereotypes 
for task classes (symbolized by a black rectangle inside of the class box), and 
input and output parameter classes (symbolized by a white and black circle, 
respectively). Stereotypes are used to express the underlying process meta model 
(DYNAMITE). 

The shown interface consists of the task class and its composed input and 
output parameter classes. Cardinalities may be defined together with the com- 
positions to restrict the number of parameters of a certain class on the instance 
level. The interface is stored in a separate package. Our usage of UML’s package 
concept is explained in detail later on. 

A task class composes all its respective realization classes (symbolized by a 
white rectangle). Since the interface abstracts from a set of possible realizations, 
these realization classes are not part of the interface of a task package. Rather an 
individual package is introduced for every realization class. Realization classes 
abstract from complex schematic realizing subprocess definitions as shown in the 
bottom part of Figure 3. These allow for the composition of other task classes 
through the following associations: Control flow associations introduce a
temporal ordering on tasks and are similar to ordering relations in PERT charts.
Feedback flow associations are always directed oppositely to control flow
associations and are introduced to mark iteration or exception steps of a process. The
direction of these associations is indicated through the role names src and trg. To
make these associations distinguishable, we introduce corresponding stereotypes,
namely «cflow» and «fback».

These association types are sufficient to model the flow of control in a group 
of tasks. For example, the Analyse Request and Redesign Application classes are 
connected by a control flow association which indicates that application redesign 
cannot take place before analysis of the request (however, execution may
overlap). Analogously, feedback is allowed between the Bottom-Up Test and the
Implement Module or Change Module classes in case of errors during the test. Of
course, feedback could equally well occur in other parts of this subprocess model,
which is not shown here.

Control flow and feedback flow associations define potential channels for 
data flow, which is explicitly modeled through data flow associations (stereotype 
«dflow») between parameter classes. Data flow associations can be introduced
between an input and an output parameter class, with the output class playing 
the role of source. Equally, they may be introduced between two input or two 
output classes if the corresponding task classes are hierarchically related. This 
gives the complex parent task the possibility to supply the refining tasks with 
its input data and to receive the results from the refining subprocess. 

In our example the Analyse Request task is supplied with the initial extension 
request. There, the request is transformed into an extension specification, which 
is subsequently sent to the Redesign Application task. After all modules have been 
changed or implemented and successfully tested, the Extended System is sent to 
the parent task as the result of process execution. 

Again, cardinalities are used to restrict the number of instances of the various 
classes at process enactment time. In the example the number of Analyse Request 
and Redesign Application tasks is restricted to exactly one, while Change Module 
and Implement Module tasks may occur in any number at enactment time. Their 
actual number will depend on the number of modules that have to be changed 
or implemented from scratch, respectively. 
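
A cardinality check of this kind can be sketched in a few lines of Python. The table below encodes the constraints of the example as (minimum, maximum) pairs, with "any number" read as zero or more; this encoding, the function name, and the instance names are assumptions made for illustration only.

```python
from collections import Counter

# Assumed encoding: task class -> (minimum, maximum); None means unbounded.
CARDINALITIES = {
    "Analyse Request": (1, 1),
    "Redesign Application": (1, 1),
    "Change Module": (0, None),
    "Implement Module": (0, None),
}


def check_cardinalities(task_instances, cardinalities=CARDINALITIES):
    """Return violations for a list of (instance name, task class) pairs."""
    counts = Counter(task_class for _, task_class in task_instances)
    violations = []
    for task_class, (low, high) in cardinalities.items():
        n = counts[task_class]
        if n < low or (high is not None and n > high):
            violations.append(f"{task_class}: {n} instance(s)")
    return violations


ok_net = [("analyse", "Analyse Request"), ("redesign", "Redesign Application"),
          ("change B", "Change Module"), ("change D", "Change Module"),
          ("implement C", "Implement Module")]
bad_net = ok_net + [("redesign again", "Redesign Application")]
```

Here check_cardinalities(ok_net) reports no violations, while the second Redesign Application instance in bad_net is flagged.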



4.2 Model Structuring 

The structuring of resulting models is vital for model comprehension and reuse. 
We use the UML’s package concept for this purpose. Packages are utilized to 
group closely related modeling elements together, to offer a separate name space, 
and to define visibilities. Each task and realization definition is stored in a
separate package. These packages are distinguished by their respective stereotypes
«TaskPackage» and «RealizationPackage». Figure 3 shows a sample task and
realization package. 

It remains to be explained how packages can be interrelated and what kind 
of visibility has to be defined for them. To establish a flexible model structuring 
Fig. 4. Usage of packages



concept, the scope of task classes for a model has to be pointed out. For this 
purpose we distinguish three categories of task classes:

1. A task class’ scope can exceed one process model. These task classes are 
highly reusable and will be called general task classes in the following. An 
example for a general task class is one for the software test which may be used 
in process models for software extension, correction, or initial development. 

2. Other task classes are very specific and only relevant for one realization class
alone. They will be denoted as specific task classes. If we look at a scenario 
where bottom-up testing can alternatively be performed as a black- or a 
white-box test, each of the corresponding realization packages will naturally 
contain a different task class for the development of test cases. 

3. A third set of task classes is characterized by their relevance for various 
realization classes of the same task class. Their scope is thus local to one 
task class and they will be called local task classes in the following. If an 
alternative to bottom-up testing is top-down testing, a task class for test 
driver development will only be needed for bottom-up testing. However, it 
will be needed regardless of how the bottom-up test will be refined. Thus, it 
is declared as a local task class for bottom-up testing. 

If we use the same categories for the task packages containing task classes, 
we can derive a suitable model structuring concept from this categorization (cf. 
Figure 4). Local task packages may be nested into general or other local task 
packages. Every task package contains its respective realization packages while 
each realization package contains its specific task packages. 

To reach abstraction and information hiding within process models, adequate 
visibilities have to be defined [22]. The visibility of realization packages is
inherently private (-), because only the interface of a task class needs to be known to
potential users. General task packages are provided with a visibility of public (+)
since their contained task interface can be used in any context. The offered task
interface of local task packages should in contrast only be visible to realization 
packages nested into the same task package but not to importers of the latter. 
For this reason, the visibility of local task packages has to be protected (#).
Specific task packages are nested into realization packages. Since realization
packages are private, no package outside of a realization package can access the
task packages it contains. The visibility of specific task packages will thus be set
to public (+), which allows the superior realization package to use the offered task
interface.
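
The visibility rules can be summarized as a small access check. The following Python sketch is one possible reading of the rules above, not part of the modeling tool: the Package class, the kind labels, the function can_use, and the package names in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Package:
    name: str
    kind: str  # "general", "local" or "specific" task package, or "realization"
    parent: Optional["Package"] = None  # enclosing package, if any


def can_use(user, task_pkg):
    """May realization package `user` use the task interface of `task_pkg`?"""
    if task_pkg.kind == "general":    # public (+): usable in any context
        return True
    if task_pkg.kind == "local":      # protected (#): same enclosing task package
        return user.kind == "realization" and user.parent is task_pkg.parent
    if task_pkg.kind == "specific":   # hidden inside a private realization package
        return user is task_pkg.parent
    return False


bottom_up = Package("Bottom-Up Test", "general")
driver = Package("Develop Test Driver", "local", bottom_up)
black_box = Package("Black-Box Realization", "realization", bottom_up)
white_box = Package("White-Box Realization", "realization", bottom_up)
cases = Package("Develop Black-Box Test Cases", "specific", black_box)
top_down = Package("Top-Down Test", "general")
other = Package("Top-Down Realization", "realization", top_down)
```

In this sketch both bottom-up realizations may use the test driver task, the top-down realization may not, and only the black-box realization sees its own test case task.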

The advantages of using UML packages in the aforementioned way are
manifold. They allow for the modular modeling of processes so that complex
processes are separated into manageable fragments and loose coupling of models is
reached. In addition, reusable process fragments can be identified since general 
and local task packages are subject to reuse. The concept introduces abstraction 
and information hiding in several ways, which allows for the manipulation and
exchange of process model fragments with little impact on the overall model. 



4.3 Behavioral Modeling 

Every task class owns a generic set of methods and an attribute for the current
execution state. The interrelations of these methods are modeled in a prescribed
state diagram (Figure 5, part i). Methods can be subdivided into the set of
state changing methods (e.g., Start, Plan, Suspend), the set of data flow related
methods (e.g., Consume, Produce, Release) and the set of methods to edit task
nets (e.g., CreateTask, CreateParameter, CreateControlflow). The figure shows all
state-changing methods but gives only examples of state-preserving ones.

Prescribing one state diagram for all task classes may seem a little restrictive, 
but allowing the modeler to define a specific state diagram for every task class 
would imply the need to model the interrelations between state diagrams for 
every pair of task classes. The resulting increased complexity would counteract 
our aim to simplify the task of modeling a process. Experience has shown that 
the state diagram of Figure 5 is sufficient to adequately model a software process. 




Fig. 5. (i) Generic and (ii) (cut-out of) specific state diagram 
The state diagram allows tasks to be currently in definition, to be waiting for
activation, to be active, suspended, or (re-)planned. Every task can terminate 
in one of two final states, one of which marks its successful completion (Done), 
while the other one marks its failure (Failed). 

Within the underlying process engine some standard behavior is defined with 
respect to this state diagram. Methods consider the states of preceding and 
succeeding tasks as well as of subordinate and superior tasks. For example, a 
task may only terminate when all predecessors have terminated. In addition, the 
suspension of a task leads to the suspension of all subordinate tasks. 
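
The two examples of standard behavior can be made concrete with a short Python sketch. Since the transitions of Figure 5 are not reproduced here, the transitions used below, the method name terminate, and the class layout are assumptions for illustration only and cover just a fragment of the state diagram.

```python
from dataclasses import dataclass, field

TERMINAL = {"Done", "Failed"}


@dataclass
class Task:
    name: str
    state: str = "Waiting"
    predecessors: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def start(self):
        if self.state != "Waiting":
            raise RuntimeError(f"{self.name} cannot start from {self.state}")
        self.state = "Active"

    def terminate(self):
        """Finish successfully; refused while a predecessor is unterminated."""
        if self.state != "Active":
            raise RuntimeError(f"{self.name} is not active")
        if any(p.state not in TERMINAL for p in self.predecessors):
            raise RuntimeError(f"{self.name} has unterminated predecessors")
        self.state = "Done"

    def suspend(self):
        """Suspend this task and, recursively, its active subordinate tasks."""
        if self.state == "Active":
            self.state = "Suspended"
        for child in self.children:
            child.suspend()


analyse = Task("Analyse Request")
redesign = Task("Redesign Application", predecessors=[analyse])
analyse.start()
redesign.start()          # allowed: a task may start before predecessors finish
try:
    redesign.terminate()  # refused: Analyse Request is still Active
    refused = False
except RuntimeError:
    refused = True
analyse.terminate()
redesign.terminate()

parent = Task("Handle Extension Request", children=[Task("Change Module B")])
parent.start()
parent.children[0].start()
parent.suspend()          # cascades to the subordinate task
```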

Every method being executed by a task by default sends out a corresponding 
event to its predecessors, successors, children, parent, and itself. The underlying 
process engine provides an event/trigger mechanism that triggers the execution 
of event handlers in the receiving objects. These event handlers enable task 
objects to react to actions performed in their respective context. 

This standard behavior can be adapted by the process modeler in three dif- 
ferent ways. The formulation of transition conditions in the Object Constraint 
Language (OCL) provides the means to influence the way a task net is executed. 
Furthermore, UML allows for the definition of action calls inside of a state to 
automate execution steps. In addition, an action can be triggered by entering or 
by leaving a state. Finally, the set of event-receiving tasks can be restricted by 
using UML's send-clauses as part of transition definitions. 

A cut-out of a sample adapted state diagram of the Handle Extension Request 
task class is shown in part ii of Figure 5. The start transition can only be 
fired when all inputs for the task are available (the standard behavior does not 
demand this). In addition, all inputs can now be automatically consumed when 
a task of this type enters the state Active, since it is ensured that they are 
available. Furthermore, the sample state diagram restricts the target set of the 
CreateFeedback event to the task itself. In this case it makes sense to restrict the 
standard behavior of sending an event to the whole context, since we want this 
event to be handled by a complex task net transformation (see below). 
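
The default event distribution and its restriction through a send-clause can be sketched as follows. The Python below is an illustration under our own assumptions: the class layout, the received log, and the targets parameter standing in for a send-clause are hypothetical and do not reflect the interface of the process engine.

```python
from dataclasses import dataclass, field


@dataclass(eq=False, repr=False)
class Task:
    name: str
    parent: "Task" = None
    predecessors: list = field(default_factory=list)
    successors: list = field(default_factory=list)
    children: list = field(default_factory=list)
    received: list = field(default_factory=list)  # log of (event, sender name)

    def context(self):
        """Default receivers: predecessors, successors, children, parent, itself."""
        receivers = self.predecessors + self.successors + self.children
        if self.parent is not None:
            receivers.append(self.parent)
        receivers.append(self)
        return receivers

    def send(self, event, targets=None):
        """Send an event to the whole context, or only to the given targets."""
        for receiver in (self.context() if targets is None else targets):
            receiver.on_event(event, self)

    def on_event(self, event, sender):
        # A real handler would react here; the sketch only records the event.
        self.received.append((event, sender.name))


handle = Task("Handle Extension Request")
analyse = Task("Analyse Request", parent=handle)
handle.children.append(analyse)

handle.send("Start")                             # default: whole context
handle.send("CreateFeedback", targets=[handle])  # restricted to the task itself
```

After these two calls the child task has seen only the Start event, whereas the sending task has recorded both events.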

Defining the specific behavior of a process model includes the specification 
of custom event handlers. An event handler can be specified for every task class. 
UML allows a method’s semantics to be specified in any language, such as C++ or
pseudo code. For example, the automatic activation of a task, dependent on
certain data being released by a predecessor, can be modeled with task class 
specific event handlers. However, the expressiveness of these event handlers is 
very limited since the task class’ context is not known at the time of its specifica- 
tion (in fact, a task class may be reused in different contexts, i.e. realizations of 
complex tasks). We therefore allow for the definition of realization class specific 
event handlers. A realization receives all events that are being sent to its owning 
task. This way events can lead to very complex task net transformations since 
the structure of the subprocess is known to the realization class. 

Complex event handlers can be specified through UML’s collaboration
diagrams which are used to define the semantics of an event handling method. A
collaboration diagram consists of objects and links that are required to be
available when the method is executed. Additionally, objects and links can be created
and destroyed during the execution of the specified method. The communication
between objects can be defined through messages which may refine links.

Fig. 6. Complex event handler (collaboration diagram as semantics definition)

Figure 6 gives an example of how a CreateFeedback event can be handled 
automatically through this mechanism. The event handler is defined for feed- 
back occurring between tasks of type Bottom Up Test and Redesign Application. 
In addition to the feedback flow’s source and target objects we search for an 
intermediate task of type Implement Module and some parameter objects. The 
event handling method then creates two parameter objects: an output parameter 
at the Bottom Up Test task and an input parameter at the Redesign Application 
task. Creation of new objects is done through sending a «create» message (see
messages numbered 1 and 2 in Figure 6). By installing a data flow link between
these parameter objects, the feedback flow is refined and an error report can be
sent to the feedback flow’s target. Links created by the method are marked with
the constraint {new}. Finally, the tasks’ behavior is specified through messages.
In the example case, the feedback flow’s source is suspended (message 3), the
erroneous output document of the feedback flow’s target is retracted (message
4), and the intermediate implementation task is suspended (message 5). 

Specifying the behavior through state diagram adaptations and the defini- 
tion of custom event handlers may still be too work intensive. Experience has 
shown that a limited set of behavioral patterns is used for most behavioral 
specifications. Examples for behavioral patterns are sequential and concurrent 
engineering as properties of control flow associations. To simplify the modeling 
of specific behavior, we propose the use of one of UML’s extension mechanisms, 
tagged values. For example, a tagged value named policy could be defined
for the stereotype «cflow» with possible values of sequential and concurrent.
Our eventual goal is to come up with a library of predefined behavioral patterns 
which may be combined easily at a high level of abstraction. 

5 Generating Executable Process Models 

UML itself is not executable, i.e., process models described in UML can neither 
be simulated nor enacted. However, they can be transformed into an executable 
form as follows. 

Fig. 7. Graph schema extension



A process specification in UML is transformed into a corresponding specifi- 
cation in PROGRES which extends the base specification (cf. Section 3). UML 
class diagrams are transformed into a graph schema extension defining domain- 
specific task and relation types (Figure 7). The figure shows a cut-out of the 
generic graph schema which defines a process meta schema. It contains node 
types as abstractions of process objects (TASK, INPUT,...) and process object 
relations (FEEDBACK, DATAFLOW,...). The latter have to be modeled as node 
types because they carry attributes which are important for the enactment and 
manipulation of instance-level task nets, and PROGRES does not support
attributed edges.

The domain specific process schema which is modeled in UML is transformed 
into node types as an extension to the generic graph schema. The modeled UML 
associations and aggregations are mapped to node types carrying schema-level 
attributes as shown in Figure 7 for the source and target types of task relations. 
This way the instantiation of the meta schema’s edge types is restricted according 
to the UML model. 

State transitions, automated action calls, and event dispatchers are trans- 
formed into procedures calling the generic model’s base operations. 

Graphical definitions of event handlers as presented in Figure 6 are
transformed as follows: Objects and links that are required to be available for
the event handler’s execution (i.e., a feedback’s source and target) are
transformed into a PROGRES graph query. A graph query searches for and returns
the graph nodes representing the needed process objects. Objects and links
created during the method’s execution are created within the graph through the
specified base operations of the generic model. All message calls to objects
(i.e., messages 3-5 in Figure 6) are transformed into corresponding calls to
base operations.

test Thandle_CreateFeedback_Event ( Src : Bottom_Up_Test;
                                    Trg : Redesign_Application;
                                    out Strd : Standard;
                                    out Impl_Module : Implement_Module;
                                    out FB : FEEDBACK;
                                    out Mspec1 : ModuleSpecO;
                                    out Mspec2 : ModuleSpecI; )
  (* graph pattern: graphical part of the figure, not reproduced here *)
  FB := '4; Mspec1 := '5; Mspec2 := '6;
end;

transaction handle_CreateFeedback_Event (Src, Trg : TASK) =
  (* declaration of local variables for process objects *)
    Thandle_CreateFeedback_Event (Src : Bottom_Up_Test,
                                  Trg : Redesign_Application, ...)
  & CreateParameter (Src, Error_ReportO, out ER1)
  & CreateParameter (Src, Error_ReportI, out ER2)
  & CreateDataflow (ER1, ER2, FB)
  & Suspend (Src)
  & UnRelease (Mspec1, Impl_Module)
  & Suspend (Trg)
end;

Fig. 8. Transformed event handler

Search and replacement of an object structure could best be realized as a 
graph transformation within PROGRES. However, the creation and deletion of 
process objects through a graph transformation would ignore the semantics of 
dynamic task nets which are contained in the base operations. This circumstance 
is also known as the graph rewriting dilemma (violation of data abstraction). 

Figure 8 shows a sample graph query and procedure (called transaction in 
PROGRES) for the collaboration diagram in Figure 6. Starting with the node for 
representing the created feedback flow, nodes representing the feedback flow’s 
source and target task, the affected tasks and significant parameters are ex- 
tracted from the graph and returned to the transaction which calls the appro- 
priate base operations on these objects. 

Since PROGRES currently supports only one unstructured graph schema, the
package structure of the UML model is lost during transformation. However, a
module concept for PROGRES is currently being developed [23] which will make it
possible to preserve this structure in the transformed model.

Technically, the transformation from UML models to PROGRES is performed in the
following way: We use the commercial CASE tool Rational Rose as a modeling
tool. Rose offers mechanisms to access its operations and the current model
through a Component Object Model (COM) interface. The implementation of the
transformation uses this interface to read the current model and generates the
PROGRES code as a text file.
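The text-generation step can be illustrated with a minimal sketch. It does not
use the Rose COM interface; instead it assumes a hypothetical in-memory list of
base operation calls (the names `Call` and `emit_transaction` are illustrative
and not part of the tool described here) and renders them in the textual shape
of the transaction shown in Fig. 8.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Call:
    """One call to a base operation, e.g. Suspend (Src). Hypothetical helper type."""
    operation: str
    arguments: List[str]


def emit_transaction(name: str, parameters: str, calls: List[Call]) -> str:
    """Render calls as a transaction, chaining them with '&' as in Fig. 8."""
    lines = [f"transaction {name} ({parameters}) ="]
    for index, call in enumerate(calls):
        # The first call is indented; every further call is chained with '&'.
        prefix = "    " if index == 0 else "  & "
        lines.append(f"{prefix}{call.operation} ({', '.join(call.arguments)})")
    lines.append("end;")
    return "\n".join(lines)


# Assumed example input: two of the calls that appear in Fig. 8.
calls = [
    Call("CreateDataflow", ["ER1", "ER2", "FB"]),
    Call("Suspend", ["Src"]),
]
text = emit_transaction("handle_CreateFeedback_Event", "Src, Trg : TASK", calls)
print(text)
```

In an actual generator, the list of calls would be derived from the messages of
the collaboration diagram rather than written by hand.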

Note that no manual implementation is involved here. Rather, the UML 
specification is transformed automatically into a PROGRES specification (which 
is translated into C code by the PROGRES compiler). In this way, we combine 
a high-level, executable specification language with an object-oriented modeling 
language constituting an emerging standard that will be used widely. 



6 Lessons Learned 

Object-oriented software process modeling. Modeling tasks as objects is 
quite natural. Dynamic task nets are created, modified, analyzed, and executed 
during the course of a software project. Therefore, a task net can be represented 
as an evolving object structure on which different components of a process
management environment operate (e.g., editor, planner, analyzer, or execution tool).

UML provides a rich set of diagrams for modeling both the structure and the 
behavior of dynamic software processes. In this paper, we have focused on the 
use of class diagrams for structural modeling and the use of state diagrams and 
collaboration diagrams for behavioral modeling. In addition, we have applied the 
UML package concept to structure process models (modeling-in-the-large). Due
to lack of space, we have described only a subset of the diagrams which we
actually use in our approach. For example, we employ both use case diagrams 
and object diagrams for process analysis [13]. 

However, we do not exploit the full range of diagrams offered by UML. In 
particular, we have dismissed activity diagrams even though they were added to 
UML for business process modeling. Unlike dynamic task nets, activity diagrams 
are statically defined at modeling time. Since activities are not modeled as ob- 
jects, it is not possible to model the dynamics of software processes at enactment 
time. 

Adapting UML. It is important to understand that we do not simply apply 
UML as it is. Rather, we adapt UML according to our process meta model. 
To this end, we employ the extension mechanisms offered by UML: stereotypes, 
constraints, and tagged values. 

Unfortunately, these mechanisms support metamodeling only to a limited 
extent. Rather, we would like to use UML itself as a full-fledged metamodeling 
language. The syntactical structure of dynamic task nets could then be defined in 
a class diagram and constraints could be added to define their static semantics. 
However, UML does not provide a layer for building domain-specific meta models 
in this way [18]. Note that defining a meta model by extending UML’s own meta 
model is an unsatisfactory approach. While adding new base classes increases 
UML’s modeling capabilities for a domain [27], it does not restrict UML with 
regard to the domain’s modeling constraints. In addition, concerns are not well
separated. 

From informal to formal process models. As such, UML is an informal
language in that process models written in UML cannot be enacted. In contrast, 
process modeling languages for process-centered software engineering environ- 
ments must be enactable, i.e., a process model must have well-defined execution 
semantics. 

As described in Section 5, we have solved this problem by defining a map- 
ping from UML process models to programmed graph rewriting systems. The 
mapping is partial because UML models must conform to the underlying pro- 
cess meta model. It is also deterministic, i.e., no user interaction is required to 
generate PROGRES code from a UML model. 

It should be noted that we have not attacked the problem of defining generally
accepted semantics for the UML. We have restricted ourselves to those parts of
UML which are relevant for our process modeling approach. In addition, we
define the semantics according to our process meta model.

UML as a unified process modeling language? In response to the large 
number of object-oriented modeling approaches, UML has emerged as a standard 
notation which makes communication easier. Instead of learning many different 
notations for the same concept, modelers may stick to just one wide-spread 
standard notation. 

In software process modeling, researchers tend to define their own languages
even if their underlying concepts could be expressed in a standard notation. We
therefore decided to investigate the use of a standard notation for process
modeling. So far, we consider our experiment successful. Class diagrams, state
diagrams, collaboration diagrams, etc. provide us with well-suited modeling
elements for software process modeling. By using a widespread modeling
language, we hope to increase the acceptance of our process modeling approach
and to ease communication with its users, who may not have any background in
software engineering at all (note that we also apply dynamic task nets in
mechanical and chemical engineering [17]).

However, concerning the unification of process modeling approaches, we are 
not too optimistic. First, many process modeling languages are not object- 
oriented or cannot be naturally expressed in an object-oriented modeling lan- 
guage such as UML (e.g., rule-based languages [2]). Second, UML can be applied 
in radically different ways to process modeling. Even when the same notation is 
used, the underlying process meta models may vary considerably. 

7 Related Work 

Throughout this paper, we have been concerned with process modeling in UML. 
We have not discussed process modeling for UML, i.e., methods for applying 
UML to software development [11]. Our approach provides a fairly general meta 
model which can be applied to a wide range of processes, not only in software
engineering, but also in other domains.

Process modeling in UML is often performed with the help of activity diagrams
[28,14]. Activity diagrams have been introduced for modeling business
processes, which causes several problems when being applied to software pro- 
cesses. Activity diagrams are viewed as process programs, even though software 
processes are hard to predict. Modeling processes as objects is more adequate 
since it allows for representing process evolution. 

The software process community has developed a wide range of process mod- 
eling languages. By and large, researchers preferred developing their own lan- 
guages rather than using wide-spread object-oriented notations. An exception to 
this is ESCAPE+ [19], which is partially based on OMT and employs class and 
state diagrams for process modeling. However, ESCAPE+ focuses on process 
specification and does not address process analysis. Moreover, the underlying 
process meta model differs considerably from DYNAMITE (e.g., ESCAPE+ 
models tasks as operations attached to document objects, while DYNAMITE 
models them as first-class objects). Further object-oriented languages for soft- 
ware process modeling not specifically committed to standard notations are E3 
[1] and SOCCA [5]. 

Let us now compare the process meta model to other work at a conceptual 
level, ignoring the notations used (note, however, that usage of a standard no- 
tation is the essential contribution of this paper). In many approaches, process 
models are viewed as programs to be executed at run time. This direct execution 
paradigm is realized, e.g., in modeling languages based on procedural
programming [26] or Petri nets [4]. In contrast, we follow an indirect
execution paradigm: the (program) task net to be executed is constructed only
at run time. In this
respect, our approach is similar to rule-based approaches, where a plan is con- 
structed dynamically [10]. However, we believe that task nets can be created 
at best semi-automatically; in particular, project managers need the ability to 
build up and modify task nets manually. 

Our approach combines UML with graph rewriting. In [8], a specification 
language (Fujaba) is described which combines UML and graph rewriting as well. 
However, our intent is not to introduce a new specification language. Rather, we 
focus on the application of an existing modeling language (UML) to process 
modeling and define the semantics of UML models by an application-specific 
mapping from UML to PROGRES. 

8 Conclusion 

We have demonstrated the use of UML as a process modeling language. More- 
over, we have shown how UML can be employed as a “front end” to a process 
management environment. To this end, a formal interpretation for UML process 
models is provided through a translation into PROGRES. 

Using UML for software process modeling provides us with several benefits. 
First, dynamic task nets lend themselves quite naturally to object-oriented mod- 
eling. Second, UML provides a rich set of diagrams for describing process mod- 
els. Third, it assists the earlier phases of process model development. Fourth, we 
may attach semantics to UML process models by mapping them to programmed
graph rewriting systems. However, we have also discovered some problems, par- 
ticularly concerning the selection of a useful subset of diagrams and the adap- 
tation of UML (missing metamodeling facilities). Finally, we acknowledge the 
advantages of using a standard notation, but we also have some reservations 
with respect to the degree of unification in software process modeling that can 
be achieved in this way. 

Before switching to UML as a process modeling language, we designed MADAM
(Management and Adaptation of Administration Models), a high-level language
consisting of both graphical and textual elements [21].
We have also implemented a modeling environment supporting the creation of 
MADAM models and their translation into a programmed graph rewriting sys- 
tem (the details of this translation can be found in [15]). The translation of UML 
process models has been performed according to the same philosophy. 

The process management environment (DYNAMITE) is currently being redesigned
and reimplemented [12]. In particular, we are going to enhance the user
interface with the help of a commercial toolkit (ILOG JViews), and we have
selected Java as our new implementation language (formerly, we used C and
Tcl/Tk).

References 

1. M. Baldi, S. Gai, M. L. Jaccheri, and P. Lago. E3: Object-oriented software process
model design. In Finkelstein et al. [7], pages 279-292.

2. N. S. Barghouti and G. E. Kaiser. Scaling up rule-based software development
environments. International Journal of Software Engineering and Knowledge
Engineering, 2(1):59-78, Mar. 1992.

3. G. Booch, J. Rumbaugh, and I. Jacobson. The Unified Modeling Language User
Guide. Addison Wesley, Reading, Massachusetts, 1998.

4. W. Deiters and V. Gruhn. The FUNSOFT net approach to software process
management. International Journal of Software Engineering and Knowledge
Engineering, 4(2):229-256, 1994.

5. G. Engels and L. Groenewegen. SOCCA: Specifications of coordinated and
cooperative activities. In Finkelstein et al. [7], pages 71-102.

6. G. Engels and G. Rozenberg, editors. TAGT ‘98 - 6th International Workshop
on Theory and Application of Graph Transformation, number tr-ri-98-201 in Series
in Computer Science, Paderborn, Germany, Nov. 1998. Department of Computer
Science, University of Paderborn.

7. A. Finkelstein, J. Kramer, and B. Nuseibeh, editors. Software Process Modelling
and Technology. Research Studies Press (John Wiley & Sons), Chichester, UK,
1994.

8. T. Fischer, J. Niere, L. Torunski, and A. Zündorf. Story diagrams: A new graph
grammar language based on the unified modeling language and Java. In Engels
and Rozenberg [6], pages 112-121.

9. P. Heimann, G. Joeris, C.-A. Krapp, and B. Westfechtel. DYNAMITE: Dynamic
task nets for software process management. In Proc. ICSE ‘18, pages 331-341,
Berlin, Mar. 1996.







D. Jager, A. Schleicher, and B. Westfechtel 



10. M. L. Jaccheri and R. Conradi. Techniques for process model evolution in EPOS.
IEEE Transactions on Software Engineering, 19(12):1145-1156, Dec. 1993.

11. I. Jacobson, G. Booch, and J. Rumbaugh. The Unified Software Development
Process. Object Technology Series. Addison-Wesley, Reading, Massachusetts, 1999.

12. D. Jäger. Generating tools from graph-based specifications. In J. Gray, editor,
Proceedings First International Symposium on Constructing Software Engineering
Tools, pages 97-107, Los Angeles, May 1999. University of South Australia, School
of Computer Science.

13. D. Jäger, A. Schleicher, and B. Westfechtel. Modeling dynamic software processes
in UML. Technical Report 98-11, RWTH Aachen, 1998.

14. A. Korthaus. Using UML for business object based systems modeling. In Schader
and Korthaus [20], pages 220-237.

15. C.-A. Krapp. An Adaptable Environment for the Management of Development
Processes. Number 22 in Aachener Beiträge zur Informatik. Augustinus
Buchhandlung, Aachen, Germany, 1998.

16. P. Lawrence, editor. Workflow Handbook. John Wiley & Sons, Chichester, UK,
1997.

17. M. Nagl and B. Westfechtel, editors. Integration von Entwicklungssystemen in
Ingenieuranwendungen. Springer-Verlag, Heidelberg, 1998.

18. Rational. UML Semantics, 1.1 edition, Sept. 1997. http://www.rational.com/uml.

19. W. Reimer, W. Schäfer, and T. Schmal. Towards a dedicated object oriented
software process modelling language. In Object-Oriented Technology - ECOOP
‘97 Workshop Reader, LNCS 1357, pages 299-302, Jyväskylä, Finland, June 1997.

20. M. Schader and A. Korthaus. The Unified Modeling Language - Technical Aspects
and Applications. Physica-Verlag, Heidelberg, Germany, 1998.

21. A. Schleicher. High-level modeling of development processes. In B. Scholz-Reiter,
H.-D. Stahlmann, and A. Nethe, editors, Process Modelling, pages 57-73, Cottbus,
Germany, Feb. 1999. Springer-Verlag.

22. A. Schürr and A. Winter. Formal definition of UML’s package concept. In Schader
and Korthaus [20], pages 144-160.

23. A. Schürr and A. Winter. UML packages for programmed graph rewriting systems.
In Engels and Rozenberg [6], pages 132-139.

24. A. Schürr, A. Winter, and A. Zündorf. Graph grammar engineering with
PROGRES. In Proc. ESEC ‘95, LNCS 989, pages 219-234, Barcelona, Spain, Sept.
1995.

25. S. Sutton and L. Osterweil. The design of a next generation process language. In
M. Jazayeri and H. Schauer, editors, Proc. ESEC ‘97, LNCS 1301, pages 142-158,
Zürich, Switzerland, Sept. 1997.

26. S. M. Sutton, D. Heimbigner, and L. J. Osterweil. APPL/A: A language for
software process programming. ACM Transactions on Software Engineering and
Methodology, 4(3):221-286, July 1995.

27. P. S. Tom Mens, Carine Lucas. Supporting reuse and evolution of UML models. In
J. B. Pierre-Alain Muller, editor, Proceedings of UML’98 International Workshop,
pages 341-350, Mulhouse, France, 1998.

28. G. Versteegen. Objektorientierte Geschäftsprozeßmodellierung mit der UML: die
Innovator Business Workbench. OBJEKTspektrum, (1):62-67, Jan. 1998.




A Probabilistic Model for Software Projects 



Frank Padberg*

Fachbereich Informatik,
Universität Saarbrücken, Germany



Abstract. A probabilistic model for software development projects is 
constructed. The model can be applied to compute an estimate for the 
development time of a project. The chances of succeeding with a given 
amount of time and the risk of deviating from the estimate can be com- 
puted as well. Examples show that the model behaves as expected when 
the input data are changed. 



1 Introduction 

At an early stage of a software development project the managers need to estab- 
lish an estimate for the development time of the project. This is a difficult task 
since an estimate depends on many factors which are unknown at that time. The 
managers also need to analyze the risk of a project. They must find answers to 
more detailed questions such as 

What are the chances that the project will be successfully completed 
within two years? 

The answer depends largely on which course the project will take. Think of the
course of a project as describing “what happens at what time”. The course is
not known in advance, so uncertainty and risk are inherent to any project. The 
best one can do is to make some sort of assessment based on one’s experience 
with past projects. The central idea of this paper is to put such assessments on 
a solid mathematical basis by constructing a probabilistic model for software 
development projects. A first model is presented in this paper. 

The core of the model consists of formulas for computing the probabilities of the 
courses that a project could take. A project is modelled as a random process 
whose state changes from time to time. This leads to a formal description of a 
course of the project as a sequence of states, see subsection 3.3. The transitions 
between the states are controlled by the transition probabilities from which the 
probabilities of the project’s courses can be computed, see subsection 3.4. 

The input data required for computing the transition probabilities are statistical 
data and design data. The statistical data are a measure of the pace at which
progress was made in previous projects, bringing in experience gained in the past.
The raw data needed to build up a database could be collected by the manager
during running projects with little effort, see subsection 4.3. The design data are 
a measure for the degree of coupling between the software product’s components, 
see subsection 4.6. The stronger the degree of coupling is, the more likely it is that
changes in one component will propagate into the other components. The design 
data have to be extracted from some early high-level design of the software. 

Once the probabilities of a project’s courses are known, the chances of succeeding 
with a given amount of time can be computed. This is done by summing up the 
probabilities of those courses that would take at most the given time to complete. 
An estimate for the development time is computed as the “weighted average”
of the lengths of the courses, the weights being the probabilities of the courses. 
The probability or risk that the time actually needed to complete the project 
will exceed the estimate by some chosen amount of time can be computed, too. 
This provides an error bound for the estimate. See subsection 3.5. 
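
The three quantities described above can be sketched in a few lines. The
distribution below is made up for illustration and is assumed to sum to 1; in
the model, each entry would be the summed probability of all courses of that
length. The function names are hypothetical.

```python
from typing import Dict

# Assumed toy distribution: completion time in slices -> probability.
completion: Dict[int, float] = {10: 0.2, 12: 0.5, 15: 0.3}


def success_probability(dist: Dict[int, float], given_time: int) -> float:
    """Chance of finishing within given_time slices."""
    return sum(p for t, p in dist.items() if t <= given_time)


def estimate(dist: Dict[int, float]) -> float:
    """Weighted average of the lengths, weighted by their probabilities."""
    return sum(t * p for t, p in dist.items())


def overrun_risk(dist: Dict[int, float], margin: float) -> float:
    """Probability that the actual time exceeds the estimate by more than margin."""
    threshold = estimate(dist) + margin
    return sum(p for t, p in dist.items() if t > threshold)


print(estimate(completion))                 # weighted average of 10, 12 and 15
print(success_probability(completion, 12))  # mass of the lengths 10 and 12
print(overrun_risk(completion, 2))          # mass of lengths above estimate + 2
```

With these numbers the estimate is about 12.5 slices, so only the course length
15 exceeds the estimate by more than two slices.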

The approach is not tailored to a specific software process model, see section 2. 
To keep the mathematics tractable, several simplifying assumptions were made in
the model presented here:

• The number of development teams equals the number of components in the 
high-level design. 

• The teams start working at the same time and keep working until their 
components are completed.

• The customer requirements do not change during the project. 

• The number of components, the degree of inter-component coupling, and the 
complexity of each of the components remain the same through all changes 
of the high-level design. 

The assumptions are preliminary, but they limit the applicability of this particular
model. In particular, volatility in the customer requirements is considered to be
one of the leading causes of schedule and cost overruns, see [6]. Therefore, the
model should be regarded as just a first step towards a comprehensive proba- 
bilistic model for software projects. It has not been tested using real world data 
so far, and it is not meant to be used in practice at this early stage. 

There is much theoretical and empirical research on software project estimation. 
For an overview of the common models and many references see [10] [11]. The
common models, in particular the curve-fitting approaches, tend to produce
unreliable estimates, see [3] [4] [5] [6]. This might be explained by the fact that
the models do not explicitly consider the uncertainty inherent to a project. There
also are problems with calibrating the models and, most notably, with obtaining
a valid size estimate as their basic input variable, see [1] [4] [7] [8]. Since there
is a strong need in the software industry for reliable estimates, increasing effort
is being spent on new approaches such as machine learning, analogy and neural
networks, see [12] [13] [14]. For an overview see [2].

The approach presented here differs from the existing models in several respects. 
First, it is based directly on modelling the courses of a software project. It empha- 
sizes that the progress of a team depends on the other teams’ progress. Second, 
it uses a standard mathematical framework for modelling a process whose line
of development is not known in advance. This includes explicit modelling of the
risk of a project. As far as I know, computing the chances of a project succeeding
and computing error bounds for the estimate are not possible with the existing
models. Third, the experience gained from past projects enters in a standard and
comprehensible way as statistical data which will automatically adjust to the
local environment.



2 Software Projects 



A software project is considered to consist of several development teams and a 
project manager. Based on some early high-level design, the software product 
gets divided into components. Each team is assigned one component and vice 
versa. The teams start working at the same time. There are no assumptions made 
about the software engineering methods or process models used by the individual 
teams. The teams work simultaneously, but not independently. They do not 
communicate directly, but there will be an interaction between them in case of 
a team detecting a problem with the system’s design. Since the components 
are coupled, such a problem is likely to affect other teams as well, so the team 
reports the problem to the manager. The manager makes sure that the system’s 
design gets revised to eliminate the problem. The team that has reported the 
problem waits until redesigning is finished. The other teams keep working. If 
there are additional problem reports while the design is being revised, they are 
taken into account, too. When the redesign is completed, the manager tells each 
team whether it is affected by the design changes or not. 




In this example, the redesign results in changes to the components that the 
first and second team are working on. The third team just continues without 
changes. It may happen that a team that has already finished has to go over 
its component again because of design changes. To sum up, the progress of a 
team depends on the other teams’ progress, the teams being linked through the 
system’s design. When all teams have reported to the manager that they have 
finished developing, the components are put together and the system gets tested. 
If errors are detected, a new development cycle begins.

The model will describe a development cycle probabilistically.

3 The Model

3.1 Time 

In the model, time is discrete. The time axis gets subdivided into periods of equal 
length, called time slices. The length of a slice can be chosen as is suitable. Think
of a slice as corresponding to, say, a month in real time. Whether a team has 
detected a problem or has finished working during a slice will be determined at 
the end of the slice. The exact point of real time at which something happened 
will be disregarded. 

There is another subdivision of the time axis. As a project advances, the system’s 
design will be revised from time to time because of problems. The time span 
between two consecutive redesigns is called a phase. Each phase consists of a 
number of slices that may vary from phase to phase. 

3.2 States 

At any point of discrete time the state of a team is represented by a natural 
number, zero included. This number is to express the progress that the team 
has made up to this point. One could think of several ways of measuring a 
team’s progress. In this version of the model the state of a team is defined as 
the number of slices that have passed since the team last reported a problem 
or was last interrupted by the manager because of being affected by changes 
in the system’s design. A value of infinity indicates that the team has finished 
developing. 

The state of a project at the end of a phase is defined as a vector

$$\zeta = (\zeta_1, \zeta_2, \ldots, \zeta_N)$$

where $\zeta_i$ is the state of team number $i$ at the end of the phase and $N$ is the
number of teams. A state vector of $\infty = (\infty, \infty, \ldots, \infty)$ means that all the
teams have finished. Note that $(4, \infty, 2)$ would not be a valid state vector,
because at the end of any phase either all teams have finished or at least one of
the teams is set back to zero.
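
The validity condition on state vectors can be written down as a small check.
This is only a sketch under the definitions above: the function name is
hypothetical, and `math.inf` stands in for the value "infinity" that marks a
finished team.

```python
import math
from typing import Sequence

FINISHED = math.inf  # a team that has finished developing


def is_valid_end_of_phase_state(state: Sequence[float]) -> bool:
    """Check the rule: all teams finished, or at least one team set back to zero."""
    # Each entry must be a natural number (zero included) or infinity.
    entries_ok = all(
        s == FINISHED or (s >= 0 and float(s).is_integer()) for s in state
    )
    all_finished = all(s == FINISHED for s in state)
    some_team_reset = any(s == 0 for s in state)
    return entries_ok and (all_finished or some_team_reset)


print(is_valid_end_of_phase_state((4, 0, 4)))         # one team was reset
print(is_valid_end_of_phase_state((4, FINISHED, 2)))  # nobody reset, not all done
```

The second call corresponds to the invalid vector mentioned in the text and
therefore yields False.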

3.3 Course of a Project 

The course of a project is modelled as a sequence

$$\zeta(1), \zeta(2), \ldots$$

of states, where $\zeta(j)$ is the project’s state vector at the end of phase $j$. It
is assumed that the number of teams doesn’t change during the project. The
sequence of states has to be supplemented by the sequence of numbers

$$d_1, d_2, \ldots$$

giving the length of each of the project’s phases.

For example, this might be the first two phases of a three-team project : 



State   0         (4, 0, 4)   (0, 6, 0)
Time    0         4           10

State vector $0 = (0, 0, 0)$ corresponds to the initial high-level design. The
project has advanced from state $0$ after 4 slices (first redesign) into state
$(4, 0, 4)$ and after 6 more slices (second redesign) into state $(0, 6, 0)$.

3.4 Transition Probabilities 

To expand the formal description of a project’s courses into a probabilistic model, 
define the transition probability 



$$P_{\zeta}(d, \eta)$$

to be the probability for ending the next phase after exactly $d$ slices with
state $\eta$ given that the previous phase ended with state $\zeta$. The formulas for
the transition probabilities will be constructed step-by-step in section 4.

Note that the transition probabilities do not depend on any information about
a project’s history except its current state $\zeta$. In particular, the transition
probabilities do not depend on the particular sequence of states that the project
passed through before reaching the current state. The model therefore will be a
Markov process. For such a modelling to make sense the state must be defined
to contain all relevant information about the project’s past.

When all the transition probabilities are known one can compute the probability
$P(\omega)$ that a project will take a particular course $\omega$ simply by multiplying the
corresponding transition probabilities,

$$P(\omega) = P_0(d_1, \zeta(1)) \cdot P_{\zeta(1)}(d_2, \zeta(2)) \cdots P_{\zeta(k-1)}(d_k, \zeta(k)).$$

For example, the probability that a project will advance in its first two phases
like in the figure above is $P_0(4, (4, 0, 4)) \cdot P_{(4,0,4)}(6, (0, 6, 0))$.
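
The product rule can be sketched in a few lines of Python. This is only an illustration, not the author's implementation: the function name `course_probability`, the lookup table `toy_transitions` and its two probability values are hypothetical, while the states and phase lengths are those of the example in subsection 3.3.

```python
from math import prod


def course_probability(course, transition, start=(0, 0, 0)):
    """Multiply the transition probabilities along a course.

    course     -- list of (d, state) pairs, one pair per phase
    transition -- function (previous_state, d, state) -> probability
    start      -- initial state of the project (all teams at zero)
    """
    previous = [start] + [state for _, state in course[:-1]]
    return prod(transition(prev, d, state)
                for prev, (d, state) in zip(previous, course))


# Hypothetical transition probabilities, for illustration only.
toy_transitions = {
    ((0, 0, 0), 4, (4, 0, 4)): 0.05,
    ((4, 0, 4), 6, (0, 6, 0)): 0.02,
}


def toy_transition(prev, d, state):
    # Transitions that are not listed are treated as impossible.
    return toy_transitions.get((prev, d, state), 0.0)


course = [(4, (4, 0, 4)), (6, (0, 6, 0))]
print(course_probability(course, toy_transition))
```

In a real application the `transition` argument would be backed by the formulas of section 4 rather than by a fixed table.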

The space describing all possible courses of a project contains arbitrarily
long sequences $\omega$ of pairs $(d, \eta)$ corresponding to a successful outcome, but
also infinite sequences corresponding to a never-ending project. This space is
uncountable, so it can’t be treated within the framework of discrete probability
theory. It is well-known how to formally construct from the transition
probabilities a non-discrete probability measure on that space using the methods of
general measure theory. For the purpose of this paper one can avoid the technical
difficulties as follows. For each project there naturally exists a “final deadline”
of, say, $x_0$ slices. If this deadline is exceeded, the project will be cancelled as a
failure. Only the probabilities of those courses get considered which have
completed before this deadline is reached, together with the remaining probability
that the project will exceed the final deadline.



3.5 Estimates 

The model can be applied as follows. Define $\Omega^{*}$ to consist of those finite
sequences $\omega$ of pairs $(d_j, \zeta(j))$ which correspond to a successful outcome of
the project, that is, for which just the last state equals infinity. Denote by $|\omega|$
the number of phases in $\omega$. Define a function

$$f : \Omega^{*} \to \mathbb{N}$$

which assigns to each finite course $\omega$ its length of time

$$f(\omega) = \sum_{j=1}^{|\omega|} d_j$$

by summing up the lengths of its phases. The probability that the project will
take exactly $x$ slices to complete is equal to

$$\varphi(x) = \sum_{\omega \,:\, f(\omega) = x} P(\omega).$$

The sum runs over all successful courses $\omega$ whose length of time is $x$ slices.
The probabilities $P(\omega)$ get computed from the transition probabilities, as is
described in the preceding subsection. Observations that will be made in
subsection 4.3 ensure that the number of different states of a project is finite and that
the length of a phase is bounded. It follows that the set $\{\omega : f(\omega) = x\}$ is
finite, and the sum has only finitely many summands.

The chances of succeeding with the project before the final deadline of $x_0$ slices
is exceeded are equal to the probability

$$\Phi(x_0) = \sum_{x \le x_0} \varphi(x).$$



The more development time there is at the beginning of the project the greater
are the chances of succeeding, because $\Phi$ is an increasing function. Suppose
that a project is approaching the final deadline. If the project’s state is $0$ after
$y$ slices, the chances of still succeeding in the remaining time are only equal to
$\Phi(x_0 - y)$ in the model. The closer the project is to the deadline, the smaller
these chances of still succeeding are.

As an estimate for the development time of a project compute the number

$$E_{\text{time}} = \sum_{x \le x_0} x \cdot \varphi(x) + x_0 \cdot (1 - \Phi(x_0)).$$

This is a “weighted average” of the lengths of the courses cut off at the final
deadline $x_0$, the weights being the probabilities of the courses, because

$$\sum_{x \le x_0} x \cdot \varphi(x) = \sum_{\omega \,:\, f(\omega) \le x_0} f(\omega) \cdot P(\omega).$$

Note that in general $E_{\text{time}}$ is not equal to the expected value of the function
$f$ expanded to a random variable on the space $\Omega$ mentioned in the preceding
subsection. It can be shown that $E_{\text{time}}$ converges from below to the expected
value as $x_0$ increases, provided that the expected value exists.

To obtain error bounds, compute the probability that the length of a project
will exceed the estimate by at most some number $c$ of slices as

$$P(E_{\text{time}} < f \le E_{\text{time}} + c) = \Phi(\lfloor E_{\text{time}} + c \rfloor) - \Phi(\lfloor E_{\text{time}} \rfloor).$$

All the formulas can be pictured using a bar chart for the probabilities $\varphi(x)$.
An example is given in subsection 5.2.
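
The estimates of this subsection are simple sums over the completion-time probabilities. The following Python sketch computes the chances of succeeding, the estimate for the development time, and the error bound. The function names and the toy distribution `phi` are hypothetical; the floor-based reading of the error bound is an assumption that matches the form of the example in subsection 5.2, where the difference of the cumulative values at 17 and 11 months is used.

```python
from math import floor


def cumulative(phi, x0):
    """Chances of completing within x0 slices (sum of phi(x) for x <= x0)."""
    return sum(p for x, p in phi.items() if x <= x0)


def estimate(phi, x0):
    """Weighted average of the completion times, cut off at the deadline x0."""
    within = sum(x * p for x, p in phi.items() if x <= x0)
    return within + x0 * (1.0 - cumulative(phi, x0))


def error_bound(phi, x0, c):
    """Probability of exceeding the estimate by at most c slices."""
    e = estimate(phi, x0)
    return cumulative(phi, floor(e + c)) - cumulative(phi, floor(e))


# Hypothetical probabilities phi(x) of finishing after exactly x slices;
# the missing 0.2 stands for projects that finish later or never.
phi = {2: 0.2, 3: 0.3, 4: 0.2, 6: 0.1}
x0 = 5

print(cumulative(phi, x0))      # chances of succeeding before the deadline
print(estimate(phi, x0))        # estimate for the development time
print(error_bound(phi, x0, 1))  # estimate exceeded by at most one slice
```

For the toy data the chances of succeeding are about 70%, the estimate is about 3.6 slices, and the error bound for one slice is about 20%.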

4 Probabilities 

4.1 Approach 

The formulas for the transition probabilities are constructed in two major steps.
First, consider the probabilities of the events that can happen during a phase.
Second, consider the probabilities for the effects that the redesign at the end of
the phase can have on the teams. More precisely, set

$$P_\zeta(d, \eta) = \sum_{\nu \,:\, d} P_\zeta(\nu) \cdot P_\zeta(\eta \mid \nu).$$

Here, $P_\zeta(\nu)$ is defined to be the probability that the phase develops according
to $\nu$, which denotes a course of the phase. The possible courses will be formalized
in the next subsection. $P_\zeta(\eta \mid \nu)$ is defined to be the conditional probability that
the phase ends with state $\eta$ assuming that it developed according to $\nu$. The sum
runs over all possible courses $\nu$ of length exactly $d$ slices. It corresponds to the
fact that different courses of a phase may lead to the same next state. As is
indicated by the lower index $\zeta$ all probabilities depend on the previous state $\zeta$
of the project, that is, the state at the beginning of the phase.

4.2 Course of a Phase

The course $\nu$ of a phase is known when its length $d$ and the numbers

$$1 \le k_1 < \ldots < k_n \le d \quad\text{and}\quad 1 \le l_1 < \ldots < l_m \le d$$

are specified. The numbers $k_j$ are the different points in time at which
problems get reported, measured as the number of slices that have passed since the
beginning of the phase. The numbers $l_j$ are the different points in time at which
teams report that they have finished. Define the set $K_j$ to contain the numbers
of the teams that report a problem at time $k_j$. The set $L_j$ corresponds to time
$l_j$. In addition, define the set $L_0$ to consist of the numbers of those teams which
do not report anything during the phase. This includes the teams which already
are in state infinity since an earlier phase.

4.3 Statistical Data 

To compute the transition probabilities, some input data are required about the
pace at which progress was made in past projects. Define the base probabilities

$$P(E^i_k) \quad\text{and}\quad P(D^i_k)$$

to be the probabilities that team number $i$ will report a problem (event $E^i_k$)
or will reach state infinity (event $D^i_k$) exactly $k$ slices after having been
interrupted for the last time by the manager because of changes in the system’s
design. Since the $D^i_k$ and $E^i_k$ are disjoint events, for fixed $i$

$$\sum_{k} \left( P(E^i_k) + P(D^i_k) \right) = 1.$$

A team will eventually reach state infinity or report a problem if it doesn’t get
interrupted. This gives a bound $k_0$ such that $P(D^i_k) = P(E^i_k) = 0$ for all
$k > k_0$, so the number of states of a project is finite.

The base probabilities have to be computed from raw data about the courses of 
past projects. During a running project, the raw data needed for future use get 
collected like this. Given that a slice corresponds to one month, the manager has 
to write down at the end of each month 

• the number of each team which reported a problem; 

• the number of each team which has finished;

• whether there was a redesign; 

• if there was a redesign, the number of each team which was affected by design 
changes. 

The base probabilities get computed as average values on a team-by-team basis
from the raw data sets of as many projects as possible. If the data sets come
from a single organization, the values will automatically adjust to the specific
environment in that organization. Updating the database and computing the
base probabilities could be supported by a tool.

For example, suppose that we have the data sets from 15 projects and that a 
particular team participated in all the projects. Assume that the team has been 
interrupted by a redesign for a total number of 35 times during these projects 
(this is 2.3 times on average). Also assume that it happened 4 times that
the team reported a problem exactly 2 months after it had been interrupted
by a redesign for the last time or right after the project had started. The base
probability $P(E_2)$ for the team would be set to $4 / (35 + 15)$, or 8 percent.
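
The arithmetic of this example can be written down as a relative frequency: the number of times the event was observed, divided by the number of times the team (re)started its work. A minimal Python sketch, in which the function name and parameter names are hypothetical:

```python
from fractions import Fraction


def base_probability(event_count, interruptions, projects):
    """Relative frequency of an event among all (re)starts of a team.

    A team starts counting slices anew after every interruption by a
    redesign and at the start of every project.
    """
    restarts = interruptions + projects
    return Fraction(event_count, restarts)


# The example from the text: 4 problem reports exactly 2 months after a
# (re)start, 35 interruptions, 15 projects.
p_e2 = base_probability(event_count=4, interruptions=35, projects=15)
print(p_e2)         # 2/25
print(float(p_e2))  # 0.08
```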

The base probabilities for a team depend upon various factors, for example 

• the software process employed by the team;

• the complexity of the component that the team has to develop;

• the skills and the experience of the team members.

Therefore, the manager should distinguish between different team productivity 
levels and component complexity classes when computing the average values, 
if the database is sufficiently large and detailed. 

In addition to the base probabilities, define the probability of redesign time

$$\gamma(k)$$

to be the probability that it will take exactly $k$ slices to redesign the system if a
problem was reported, provided that there are no further problem reports in the
meantime. The probabilities of the redesign times get computed from the raw
data sets of past projects just like the base probabilities. The values depend on
the complexity of the software product’s design. There certainly is a bound on
the number of slices that a redesign will take. Along with the bound mentioned
above, the length of a phase is bounded. For later use, define

$$T(k) = \sum_{s=0}^{k} \gamma(s).$$



4.4 Advance of a Team 

Suppose that team number $i$ is in state $\zeta_i \ne \infty$ at the beginning of some phase.
The probabilities describing how the team will advance during the phase can
be computed using the team’s base probabilities. For example, the probability
that the team will report a problem after $k$ slices is equal to the conditional
probability

$$P(E^i_{\zeta_i + k} \mid B^i_{\zeta_i}).$$

Conditioning by the event

$$B^i_{\zeta_i} = \overline{\bigcup_{s=1}^{\zeta_i} \left( E^i_s \cup D^i_s \right)}$$

takes into account the knowledge that the team already has been working prior
to the current phase for $\zeta_i$ slices without having been interrupted. It follows
from $E^i_{\zeta_i + k} \subset B^i_{\zeta_i}$ that

$$P(E^i_{\zeta_i + k} \mid B^i_{\zeta_i}) = \frac{P(E^i_{\zeta_i + k})}{P(B^i_{\zeta_i})}.$$

Similarly, the probability that the team will have finished after $l$ slices equals

$$P(D^i_{\zeta_i + l} \mid B^i_{\zeta_i}).$$

The probability that the team will not report anything for a period of $d$ slices
is equal to

$$P(B^i_{\zeta_i + d} \mid B^i_{\zeta_i}).$$

As can be seen from the definition, the probability of event $B^i_{\zeta_i}$ is given by

$$P(B^i_{\zeta_i}) = 1 - \sum_{s=1}^{\zeta_i} \left( P(E^i_s) + P(D^i_s) \right).$$

If $\zeta_i \ne \infty$ but $P(B^i_{\zeta_i}) = 0$, the conditional probabilities involving $B^i_{\zeta_i}$ are
set to zero. If $\zeta_i = \infty$, they are set to one.
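
The conditional probabilities of this subsection, including the two special cases, can be sketched as follows. The function names are hypothetical, and the base probabilities used in the example are illustrative values for the first three slices of a fictitious team, not data from the paper.

```python
import math


def no_report_yet(p_e, p_d, zeta):
    """Probability of no report in the first zeta slices since the last restart."""
    reported = sum(p_e.get(s, 0.0) + p_d.get(s, 0.0) for s in range(1, zeta + 1))
    return 1.0 - reported


def conditional(p_event, p_e, p_d, zeta, k):
    """Probability of the event exactly k slices from now, given state zeta.

    p_event is either p_e (problem report) or p_d (team finishes).
    """
    if math.isinf(zeta):
        return 1.0          # special case: team already in state infinity
    b = no_report_yet(p_e, p_d, zeta)
    if b <= 0.0:
        return 0.0          # special case: conditioning event has probability zero
    return p_event.get(zeta + k, 0.0) / b


def stays_silent(p_e, p_d, zeta, d):
    """Probability that the team reports nothing for another d slices."""
    if math.isinf(zeta):
        return 1.0
    b = no_report_yet(p_e, p_d, zeta)
    if b <= 0.0:
        return 0.0
    return no_report_yet(p_e, p_d, zeta + d) / b


# Illustrative base probabilities, indexed by the slice number k.
p_e = {1: 0.10, 2: 0.09, 3: 0.072}   # problem reported after k slices
p_d = {1: 0.00, 2: 0.09, 3: 0.216}   # team finished after k slices

zeta = 2  # the team has worked two slices without interruption
print(conditional(p_e, p_e, p_d, zeta, 1))  # problem in the next slice
print(conditional(p_d, p_e, p_d, zeta, 1))  # finished in the next slice
print(stays_silent(p_e, p_d, zeta, 1))      # nothing reported in the next slice
```

For these values the three outcomes of the next slice have probabilities of about 0.1, 0.3 and 0.6, which add up to one as they should.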

4.5 Probability $P_\zeta(\nu)$

As a first step, the probabilities that the individual teams will advance according
to the given course $\nu$ are multiplied. For example, if the second team reports a
problem at time $k_j$ this adds the factor

$$P(E^2_{\zeta_2 + k_j} \mid B^2_{\zeta_2}).$$

Multiplying the individual probabilities is appropriate because the problems
reported by one or more teams during a phase will have no effect on the other
teams until the end of that phase. By that time the new design is completed and
the manager will interrupt the teams which are affected by changes. Therefore,
the teams work independently of one another during the phase.

As a second step, a factor of

$$1 - T(k_{j+1} - k_j - 1)$$

has to be added for each $j = 1, \ldots, n - 1$. This corresponds to the fact that
redesigning didn’t get accomplished in the period between the problem reports at
time $k_j$ and those at time $k_{j+1}$. Since the new design eventually is completed
$d - k_n$ slices after the problem reports at time $k_n$, this adds another factor

$$\gamma(d - k_n).$$

As a third step, since both $\nu$ and $\zeta$ were arbitrarily chosen, probability $P_\zeta(\nu)$
has to be set to zero if it is not possible for a phase to develop according to $\nu$
when starting from $\zeta$. If $\nu$ contains problem reports, the only condition is that
teams which already are finished since an earlier phase won’t report anything,

$$\zeta_i = \infty \;\Rightarrow\; i \in L_0.$$

Define the indicator $I_\zeta(\nu)$ to be one if $\nu$ fulfils that condition and to be zero
if not. If $\nu$ does not contain problem reports, the phase ends if and only if all
teams that still have been working at the beginning of the phase eventually reach
state infinity, the last one after $d$ slices,

$$l_m = d \quad\text{and}\quad \bigcup_{j=1}^{m} L_j = \{\, i \mid \zeta_i \ne \infty \,\}.$$

Define a second indicator to be one if $\nu$ fulfils that (stronger) condition and to
be zero if not.

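Summing up, the formula for probability $P_\zeta(\nu)$ looks like this if there are
problem reports during the phase:

$$
\begin{aligned}
P_\zeta(\nu) ={} & \prod_{j=1}^{n} \prod_{i \in K_j} P(E^i_{\zeta_i + k_j} \mid B^i_{\zeta_i}) \\
& \times \prod_{j=1}^{m} \prod_{i \in L_j} P(D^i_{\zeta_i + l_j} \mid B^i_{\zeta_i}) \\
& \times \prod_{i \in L_0} P(B^i_{\zeta_i + d} \mid B^i_{\zeta_i}) \\
& \times \prod_{j=1}^{n-1} \left( 1 - T(k_{j+1} - k_j - 1) \right) \\
& \times \gamma(d - k_n) \times I_\zeta(\nu).
\end{aligned}
$$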
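
The two redesign-related factors of this formula depend only on the redesign-time probabilities and on the report times. A minimal Python sketch, assuming the redesign-time probabilities are given as a dictionary from the number of slices to a probability; the function names and the values in `gamma` are hypothetical:

```python
def cumulative_redesign(gamma, k):
    """T(k): probability that a redesign takes at most k slices."""
    return sum(gamma.get(s, 0.0) for s in range(0, k + 1))


def redesign_factor(gamma, report_times, d):
    """Redesign-related factors of the probability of a course of a phase.

    report_times -- increasing times k_1 < ... < k_n of the problem reports
    d            -- length of the phase
    """
    factor = 1.0
    # The redesign was not completed between consecutive problem reports.
    for earlier, later in zip(report_times, report_times[1:]):
        factor *= 1.0 - cumulative_redesign(gamma, later - earlier - 1)
    # The redesign is completed d - k_n slices after the last report.
    return factor * gamma.get(d - report_times[-1], 0.0)


# Hypothetical redesign-time probabilities: one, two or three slices.
gamma = {1: 0.5, 2: 0.3, 3: 0.2}

print(redesign_factor(gamma, [2, 4], 5))  # 0.25
```

In the printed example the redesign started at time 2 is not finished within one slice (probability 0.5), and the redesign after the report at time 4 takes exactly one slice (probability 0.5).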



If there are no problem reports during the phase, the formula looks like this:

$$P_\zeta(\nu) = \prod_{j=1}^{m} \prod_{i \in L_j} P(D^i_{\zeta_i + l_j} \mid B^i_{\zeta_i}),$$

multiplied by the indicator of the (stronger) condition from the third step.

4.6 Design Data

To compute the transition probabilities, some input data about the degree of
coupling between the software product’s components are required. For nonempty
subsets $K$ and $X$ of the set of components $\{1, \ldots, N\}$ the dependency degree

$$\alpha(K, X)$$

is defined to be the probability that changes in the system’s design will extend
over exactly the components $X$ given that the problems causing the redesign
were detected in the components $K$. A component gets identified with its
number here to simplify notation. For fixed $K$,

$$\sum_{X} \alpha(K, X) = 1$$

because design changes will extend over one and only one of the sets $X$. Since
a problem that was detected in a particular component will result in changes to
that component, it is required that

$$K \not\subset X \;\Rightarrow\; \alpha(K, X) = 0.$$

The dependency degrees have to be extracted from the high-level design of the
software product being developed. They can be viewed as a partial measure for
the complexity of a high-level design.

4.7 Probability $P_\zeta(\eta \mid \nu)$

For this subsection, a state vector $\zeta \ne \infty$ and a course $\nu$ of a phase are fixed. It
suffices to consider only such pairs $(\zeta, \nu)$ for which $I_\zeta(\nu) = 1$, respectively,
for which the indicator of the stronger condition equals one, see the third step in
subsection 4.5.

Given $\zeta$ and $\nu$, not every state $\eta$ is a valid next state for the project. For
example, teams that were in state infinity at the beginning of the phase and are
not affected by design changes will still be in state infinity at the beginning of
the next phase. Therefore, probability $P_\zeta(\eta \mid \nu)$ has to be set to zero if $\eta$ is
not a valid next state.

Suppose that according to $\nu$ there are problem reports during the phase. Denote
by $X$ the set containing the numbers of the teams which are affected by design
changes. To characterize the valid next states, note that the next state already
is uniquely determined when $X$ is specified. This can be seen as follows.

• From the definition for the state of a team it follows that the next state of
a team which is affected by changes is zero:

$$i \in X \;\Rightarrow\; \eta_i = 0.$$

• If a team has been working during the whole phase and is not affected by
changes it will have advanced by $d$ slices:

$$i \in L_0 \setminus X \text{ and } \zeta_i < \infty \;\Rightarrow\; \eta_i = \zeta_i + d.$$

• If a team has been in state infinity and is not affected by changes it will
remain in that state:

$$i \in L_0 \setminus X \text{ and } \zeta_i = \infty \;\Rightarrow\; \eta_i = \infty.$$

• If a team has finished working during the phase and is not affected by changes
it will be in state infinity:

$$i \in L_j \setminus X \;\Rightarrow\; \eta_i = \infty.$$

In addition, it is required that

$$K = \bigcup_{j=1}^{n} K_j \subset X$$

since a team which has reported a problem will be affected by changes. On the
other hand, since only those teams which are affected by changes will have zero
as their next state, the set $X$ is determined if $\eta$ is given,

$$X = \{\, i \mid \eta_i = 0 \,\}.$$

Therefore, a state vector $\eta$ is a valid next state of the project if and only if the
values of its nonzero entries are as described above and $K \subset \{\, i \mid \eta_i = 0 \,\}$.
Define an indicator to be one if $\eta$ is a valid next state and to be zero if not.
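
The rules for the next state translate directly into code. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, teams are numbered from 1, state infinity is represented by `math.inf`, and the sets of affected and reporting teams in the example calls are invented to reproduce the states of the example in subsection 3.3.

```python
import math

INF = math.inf


def next_state(zeta, d, finished, affected):
    """Next state vector, given the set X of teams affected by changes.

    zeta     -- tuple of team states at the beginning of the phase
    d        -- length of the phase in slices
    finished -- numbers of the teams that finished during the phase
    affected -- numbers of the teams affected by design changes (the set X)
    """
    eta = []
    for i, z in enumerate(zeta, start=1):
        if i in affected:
            eta.append(0)          # affected teams are set back to zero
        elif i in finished or z == INF:
            eta.append(INF)        # finished teams stay finished
        else:
            eta.append(z + d)      # working teams advance by d slices
    return tuple(eta)


def is_valid_next_state(eta, zeta, d, finished, reporters):
    """Indicator of a valid next state; reporters is the set K."""
    affected = {i for i, e in enumerate(eta, start=1) if e == 0}
    return reporters <= affected and eta == next_state(zeta, d, finished, affected)


print(next_state((0, 0, 0), 4, set(), {2}))     # (4, 0, 4)
print(next_state((4, 0, 4), 6, set(), {1, 3}))  # (0, 6, 0)
```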

For a valid next state, the conditional probability $P_\zeta(\eta \mid \nu)$ is equal to the
probability that just the teams with numbers from $X = \{\, i \mid \eta_i = 0 \,\}$ will be
affected by changes. This probability equals the dependency degree

$$\alpha(K, \{\, i \mid \eta_i = 0 \,\}).$$

It implicitly is assumed here that the dependency degrees remain about the same
through all changes of the system’s design. If $K$ is nonempty, the formula for
probability $P_\zeta(\eta \mid \nu)$ thus is given by

$$P_\zeta(\eta \mid \nu) = \alpha(K, \{\, i \mid \eta_i = 0 \,\}) \times \mathbf{1}[\eta \text{ is a valid next state}].$$

It remains to consider the case that according to $\nu$ all teams have finished by
the end of the phase. The only possible next state is $\eta = \infty$. If $K$ is empty,
the formula for probability $P_\zeta(\eta \mid \nu)$ thus is

$$P_\zeta(\eta \mid \nu) = \begin{cases} 1 & \text{if } \eta = \infty \\ 0 & \text{otherwise.} \end{cases}$$


4.8 Proofs 

The formulas for the transition probabilities $P_\zeta(d, \eta)$ yield a Markov process
model only if for each state $\zeta \ne \infty$ the probabilities for the transitions from
state $\zeta$ to some other state sum up to one. That is,

$$\sum_{d,\, \eta} P_\zeta(d, \eta) = 1.$$

The proof of this is involved and is given in [9].



5 Examples 

Some small examples show that the model behaves as expected when the
input data are changed. Due to their large number, the computed transition
probabilities are not listed. A table containing the numbers for all the charts
printed below is available by email from the author. Recall that the following
input data have to be specified:

• the number of teams $N$;

• the base probabilities $P(E^i_k)$ and $P(D^i_k)$ for each team;

• the probabilities of redesign times $\gamma(k)$;

• the dependency degrees $\alpha(K, X)$.

All probabilities are specified as percentages.



5.1 Input Data 

There are three teams. A slice in discrete time corresponds to a month in real 
time. Each of the teams will have finished or will report a problem after at most 
six months, provided that it doesn’t get interrupted in the meantime. The base 
probabilities for the teams are : 



k    P(D^1_k)  P(E^1_k)  P(D^2_k)  P(E^2_k)  P(D^3_k)  P(E^3_k)
1       0.0      10.0      10.0      30.0      10.0       5.0
2       9.0       9.0      18.0      12.0       8.5       8.5
3      21.6       7.2      15.0       6.0      20.4      20.4
4      21.6       4.3       3.6       3.6      15.0       2.7
5      12.1       1.7       0.5       1.1       7.1       1.0
6       3.1       0.4       0.1       0.1       1.3       0.1

Redesigns will be completed within a month,

$$\gamma(1) = 100.$$
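
As a quick plausibility check of the base probabilities, the two columns belonging to each team must add up to 100 percent, as required in subsection 4.3. The Python sketch below reads the table row by row; the pairing of adjacent columns per team is an assumption about the table layout, and the function name is hypothetical.

```python
# Rows k = 1..6 of the base-probability table, in percent.
# Adjacent pairs of columns are assumed to belong to teams 1, 2 and 3.
table = [
    (0.0, 10.0, 10.0, 30.0, 10.0, 5.0),
    (9.0, 9.0, 18.0, 12.0, 8.5, 8.5),
    (21.6, 7.2, 15.0, 6.0, 20.4, 20.4),
    (21.6, 4.3, 3.6, 3.6, 15.0, 2.7),
    (12.1, 1.7, 0.5, 1.1, 7.1, 1.0),
    (3.1, 0.4, 0.1, 0.1, 1.3, 0.1),
]


def team_totals(rows, teams=3):
    """Sum of both base-probability columns of each team, in percent."""
    return [round(sum(row[2 * t] + row[2 * t + 1] for row in rows), 1)
            for t in range(teams)]


print(team_totals(table))  # [100.0, 100.0, 100.0]
```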

The values for the dependency degrees are given in the next table. Several entries
are zero since this is required if $K \not\subset X$, see subsection 4.6. The non-zero entries
are assumed to be uniformly distributed for each $K$ in this example.



α(K, X)     1     2     3    1,2   1,3   2,3   1,2,3
1          25     0     0    25    25     0     25
2           0    25     0    25     0    25     25
3           0     0    25     0    25    25     25
1,2         0     0     0    50     0     0     50
1,3         0     0     0     0    50     0     50
2,3         0     0     0     0     0    50     50
1,2,3       0     0     0     0     0     0    100



The table contains one line for each K and one column for each X. The set 
braces are left out. 
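
A table of this kind can be generated mechanically. The following Python sketch builds dependency degrees that are uniform over all sets $X$ containing $K$ and zero otherwise; the function names are hypothetical.

```python
from itertools import combinations


def nonempty_subsets(n):
    """All nonempty subsets of the team numbers 1..n."""
    teams = range(1, n + 1)
    return [frozenset(c) for r in teams for c in combinations(teams, r)]


def uniform_dependency_degrees(n):
    """Dependency degrees in percent, uniform over all X that contain K."""
    subsets = nonempty_subsets(n)
    alpha = {}
    for K in subsets:
        supersets = [X for X in subsets if K <= X]
        for X in subsets:
            alpha[K, X] = 100 / len(supersets) if K <= X else 0.0
    return alpha


alpha = uniform_dependency_degrees(3)
print(alpha[frozenset({1}), frozenset({1, 2})])        # 25.0
print(alpha[frozenset({1, 2}), frozenset({1, 2, 3})])  # 50.0
print(alpha[frozenset({1}), frozenset({2, 3})])        # 0.0
```

Each row of the generated table sums to 100 percent, and every entry with $K$ not contained in $X$ is zero, as required in subsection 4.6.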



5.2 Results 

Suppose that a project has last subsection’s input data. The following chart
shows up to $x = 24$ the values of the probabilities $\varphi(x)$ that the project will
take exactly $x$ months to complete.







The peak value is 9.2%. The first value is zero since the probability $P(D^1_1)$
that the first team will have finished after one month is zero. The chances of
succeeding within two years correspond to the sum of the areas of the bars, which
is about 92% for this project. This answers for the sample project the question
posed in the introduction. The 50% threshold is reached after 9 months, the
75% threshold after 15 months.

Choose a final deadline of $x_0 = 24$ months. The estimate for the development
time then is

E time ~ 1 

months. To picture this using the chart, multiply each bar $x$ with its height
$\varphi(x)$ and sum up. Then add a small adjustment which takes into account the
remaining probability for exceeding 24 months of development time.

The standard deviation for the chart is 6.5 months. The probability that the
development time actually needed to complete the project exceeds the
estimate, the delay being at most the standard deviation, is equal to

$$\Phi(17) - \Phi(11) = 20\%.$$

This error bound corresponds to the sum of the areas of the bars numbered 12
through 17 in the chart.

5.3 Different Dependency Degrees 

To illustrate the influence of the values of the dependency degrees on the results,
consider the “best case” and the “worst case” given by

$$\alpha(K, K) = 100 \quad\text{and}\quad \alpha(K, \{1, 2, 3\}) = 100.$$

The best case means that no teams will be affected by changes in the system’s
design other than those which reported a problem. The worst case means that
all teams will be affected by changes no matter which teams reported a problem.
The remaining input data are fixed. The charts for the resulting functions $\varphi_1$
and $\varphi_2$ look like this:



[Charts of $\varphi_1(x)$ (best case, axis marks at 1, 8, 11, 24) and $\varphi_2(x)$
(worst case, axis marks at 1, 12, 22, 24).]



For the best case, the chances of succeeding within two years have improved
to 99%. The 50% threshold is reached after 8 months, the 75% threshold
after 11 months. For the worst case, the chances of succeeding within two years
are only 79%. The 50% threshold is not reached until 12 months, the 75%
threshold not until 22 months. The following charts compare the values to the
initial example:

The differences show that the first bars for the best case lie above the
corresponding bars of the initial example. There is a turning point because the total
area of the bars in the chart of both the functions $\varphi$ and $\varphi_1$ is bounded by
100%. The increased weight of the first bars suffices to get a smaller estimate
of 8.6 months for the development time. The standard deviation has a better
value of 4.5. The probability of exceeding the estimate by at most the standard
deviation is 28%. The results are the other way round for the worst case. The
first bars lie below the corresponding bars of the initial example. Again there is
a turning point. The worst case has a higher estimate of 13.5 months for the
development time. The standard deviation is 7.6 and the error bound is 20%.

5.4 Different Base Probabilities 

The model behaves similarly when changing the values of the base probabilities. 
If the base probabilities of the second team are replaced with those of the first 
team, the results improve. On the other hand, if the base probabilities of the 
third team are replaced with those of the second team, the results get worse. 
The other input data are fixed. 

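[Charts of $\varphi_3(x)$ (axis marks at 1, 8, 13, 24) and of the corresponding
function for the second variant (axis marks at 1, 10, 17, 24).]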




The chances of succeeding within two years are 96% and 89%. The estimates
for the development time are 9.9 and 11.8 months. The standard deviations
are 5.7 and 6.8 months, the error bounds 25% and 22%.

Abstract. Previously, we defined a blackbox formal system modeling
language called RSML (Requirements State Machine Language). The 
language was developed over several years while specifying the system 
requirements for a collision avoidance system for commercial passenger 
aircraft. During the language development, we received continual feed- 
back and evaluation by FAA employees and industry representatives, 
which helped us to produce a specification language that is easily learned 
and used by application experts. 

Since the completion of the RSML project, we have continued our re- 
search on specification languages. This research is part of a larger effort 
to investigate the more general problem of providing tools to assist in 
developing embedded systems. Our latest experimental toolset is called 
SpecTRM (Specification Tools and Requirements Methodology), and the 
formal specification language is SpecTRM-RL (SpecTRM Requirements
Language).

This paper describes what we have learned from our use of RSML and 
how those lessons were applied to the design of SpecTRM-RL. We discuss
our goals for SpecTRM-RL and the design features that support each of 
these goals. 



1 Introduction

In 1994, we published a paper describing a blackbox formal system modeling
language called RSML (Requirements State Machine Language). The language was
developed over several years during an effort to specify the system requirements
for a collision avoidance system for commercial passenger aircraft called TCAS II
(Traffic Alert and Collision Avoidance System). Because this was to be the
official FAA (Federal Aviation Administration) specification, it was developed
with continual feedback and evaluation by FAA employees, airframe
manufacturers, airline representatives, pilots, and other external reviewers. Most of the
reviewers were not software engineers or even computer scientists, and we believe
this helped in producing a specification language that is easily learned and used
by application experts. RSML is still being used by the FAA, its subcontractors,
and RTCA committees to specify the upgrades and changes to TCAS II.

Those designing specification languages often have themselves in mind as po- 
tential users. However, our familiarity with certain notations, especially mathe- 
matical notations, such as predicate calculus, hides their weaknesses. Our first 
attempts at designing RSML, therefore, were failures: Our notation was clear to 
us but not to the representatives from the airframe manufacturers, component
subcontractors, airlines, and pilot groups that reviewed the TCAS specification 
during its development. The feedback from a diverse group of users helped us to 
evaluate the evolving specification language more objectively in terms of what 
did and did not need to be in the language; how to satisfy our language design 
criteria; and its practicality, feasibility, and usability. 

Due to pressure to meet FAA deadlines for getting TCAS II on aircraft, we
were unable to use immediately all the lessons learned from that experience and
apply them to the design of RSML. Since that time, we have specified several
additional systems including a robot, flight management system, and air traffic
control components, each time learning more lessons about the design of
formal specification languages. Our research goal is to determine how specification
languages can be designed to reflect these lessons. Our research paradigm is to
determine important goals for specification languages from experience with
industrial applications, to generate hypotheses about how those goals might be
accomplished, and then to instantiate these hypotheses in the design of a
specification language that we will use in future experimentation. In this way, we
hope to build knowledge incrementally about how to most effectively design
specification languages.

Our specification language research is part of a larger research effort to in- 
vestigate the more general problem of providing tools to assist in developing 
embedded systems. Our latest experimental toolset is called SpecTRM (Spec- 
ification Tools and Requirements Methodology), and the formal specification 
language is SpecTRM-RL (SpecTRM Requirements Language). In addition to 
the general goals we had for designing RSML [9] , the lessons we have learned to 
date have focused our latest efforts on solving the following problems: 

1. Through the use of RSML, we have determined that readability and re- 
viewability by domain experts can be further enhanced by minimizing the 
semantic distance between the reviewer’s mental model and the constructed 
models. The problem we are now addressing is how to construct a modeling 
language that will allow and encourage modelers to reduce this semantic 
distance in the models they build. 

2. Specifiers are used to including internal design in their specifications and 
seem to have difficulty building pure blackbox requirements models. So a 
second goal was to provide more support and guidance in building software 
requirements (versus software design) models. 

3. We found certain common features of formal specification languages were
very error-prone in use. In particular, the use of internal broadcast events
accounted for most of the errors found by reviewers of the TCAS specification
and also contributed substantially to the difficulty reviewers had in reading
the models. A third goal for SpecTRM-RL was to determine if such internal
events can be effectively eliminated from state-based modeling languages.

4. Formal models are expensive to produce. Thus, reuse of at least parts of the
language should be supported by the language design. Such features should
also support the design of models for product families.

5. Accidents and major losses involving computers are usually the result of 
incompleteness or other flaws in the software requirements, not coding er- 
rors [8,12]. We previously defined a set of formal criteria to identify missing, 
incorrect, and ambiguous requirements for process-control systems [6,8]. En- 
gineers have made the criteria into checklists and used them on a variety 
of applications, such as radar systems, the Japanese module of the Space 
Station, and review criteria for FDA medical device inspectors. Two goals 
for SpecTRM-RL are to determine (1) how to enforce as many of the con- 
straints as possible in the syntax of the language and (2) how to design the 
language to enhance the ability to manually check or build tools to auto- 
matically check the specifications for the criteria that cannot be enforced by 
the language design itself. 

This paper describes what we have learned from our experimentation with 
the design of SpecTRM-RL about achieving the first four goals. Our results 
for the fifth goal will be described in a future paper. The design features of 
SpecTRM-RL that support each of these goals are described but a complete 
description of the language, including its syntax, is beyond the scope of this 
paper. We are currently producing a SpecTRM-RL language design manual and 
automated tools to assist in experimental use of the language. 



2 Enhancing Usability and Reviewability 

The primary goal for the design of a specification language should be to make 
the representation appropriate for the tasks to be performed by the users, i.e., to 
make the design user-centered (rather than designed primarily to make analysis 
easier or to be faithful to standard mathematical conventions). Software is a 
human product and specification languages are used to help humans perform 
the various problem-solving activities involved in software engineering. Our goal 
is to provide specifications that support human problem-solving and the tasks 
that humans must perform in software development and evolution as well as 
to allow automated analysis. We attempt to support human problem-solving
by grounding specification design on psychological principles of how humans use 
specifications to solve problems as well as on basic system engineering principles. 
We discuss two of these aspects here: minimizing semantic distance (problem 1 
above) and building blackbox specifications (problem 2 above). Problems 3 and 
4, as they reflect on the design of SpecTRM-RL, are discussed in later sections 
of this paper. 

An important psychological principle for enhancing reviewability is the con- 
cept of semantic distance [5] . We define an informal concept of semantic distance, 
similar to Norman’s use of the term, as the amount of effort required to translate 
from one model to another. We believe that in order to maximize the application 
expert’s ability to find errors in a requirements specification, the semantic dis- 
tance between their understanding of the required process control behavior (their 
mental model of the system) and the specification of that behavior must be min- 
imized. This, in turn, implies that the requirements be written entirely in terms 
of the components and state variables of the controlled system. Specifically, “pri- 
vate” variables and procedures (functions) related only to the implementation of 
the requirements and not part of the application expert’s view of the controlled 
system should not be used. That is, the specification should be black box. 

A blackbox model of behavior permits statements and observations to be
made only in terms of outputs and the inputs that stimulate or trigger those 
outputs. The model does not include any information about the internal design 
of the component itself, only its externally visible behavior. The overall system
behavior is described by the combined behavior of the components, and the 
system design is modeled in terms of those component behavior models and the 
interactions and interfaces among the components. 

When the description of the required controller behavior includes more than 
just its blackbox behavior (e.g., it includes software design information), then the 
semantic distance between the required process-control behavior and the spec- 
ified controller behavior increases and the relationship between them becomes 
more difficult to validate (d1 vs. d2 in Figure 1). In fact, if adequately efficient
code can be generated from the requirements specification directly, then an in- 
ternal design specification may never be needed. “Adequately efficient” must be 
determined for each specific application’s timing requirements. We are working 
on this code-generation problem [7]. 

In addition, the requirements review process involves validating the rela- 
tionship between changes in the real-world process and the specified changes 
and response in the control function model. Therefore, reviewability will be en- 
hanced if the requirements specification explicitly shows this relationship. We 
discuss this further in the next section. 



[Figure 1 diagram: the User’s Mental Model of Desired Process-Control Behavior,
at distance d1 from the Blackbox Specification of Controller Behavior, followed
by the Design Specification and the Implementation.]

Fig. 1. Reviewability increases as the semantic distance between the user’s men-
tal model of the desired behavior and the specification (d1 vs. d2) decreases.



Blackbox requirements specification languages not only enhance readability 
and reviewability, but they also simplify the transition from system requirements 
and system design to software requirements. The gap between system design and 
software requirements is frequently cited as a major problem in our interactions 
with industry. We believe some of the problem stems from the fact that software 
requirements often contain a lot of software design decisions, which makes the 
gap between the two specifications larger and more complex to negotiate. 



2.1 Minimizing Semantic Distance 

Our language is designed primarily for process-control systems. Therefore we
attempt to minimize the semantic distance d1 by basing the specification lan-
guage design on fundamental process control principles. In process control, the
goal is to maintain a particular relationship or function F over time (t) between
the input to the system Is and the output from the system Os in the face of
disturbances D in the process (see Figure 2). This system function consists of the
functional description and the set of constraints on the system [11]. At any mo-
ment, there is a unique set of relationships between inputs and outputs whereby
each output value will be related to the past and present values of the inputs and
time. These relationships will involve fundamental chemical, thermal, mechani-
cal, aerodynamic, or other laws as embodied within the nature and construction
of the system. The system is constructed from components whose interaction
implements F including, usually, a controller or controllers whose function is to
ensure that F is correctly achieved.




Fig. 2. Basic Process Control Loop 



A typical process-control system can be divided into four types of compo-
nents: the process, sensors, actuators, and controller (see Figure 2).

Process: The behavior of the process is monitored through controlled variables
(Vc) and controlled by manipulated variables (Vm). The process can be de-
scribed by the process function Fp, a mapping from Vm × Is × D × t → Os × Vc.

Sensors: These devices are used to monitor the actual behavior of the process
by measuring the controlled variables. For example, a thermometer may
measure the temperature of a solvent in a chemical process or a barometric
altimeter may measure altitude of an aircraft above sea level. The sensor
function Fs maps Vc × t → I.

Actuators: These are devices designed to manipulate the behavior of the pro-
cess, e.g., valves controlling the flow of a fluid or a pilot changing the direction
and speed of an aircraft. The actuators physically execute commands issued
by the controller in order to change the manipulated variables. The func-
tionality of the actuators is described by the actuator function Fa mapping
O × t → Vm.

Controller: The controller is an analog or digital device used to implement the
control function. The functional behavior of the controller is described by
a control function (Fc) mapping I × C × t → O, where C denotes external
command signals. The process may change state not only through inter-
nal conditions and through the manipulated variables, but also by distur-
bances (D) that are not subject to adjustment and control by the controller.
The general control problem is to adjust the manipulated variables so as to
achieve the system goals despite disturbances. Feedback is provided via the
controlled variables in order to monitor the behavior of the process. This
feedback information (along with external command signals C) can be used
as a foundation for future control decisions as well as an indicator of whether
the changes in the process initiated by the controller have been achieved.
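
The four component functions above can be read as type signatures. The following is a minimal sketch in Python that wires hypothetical instances of them into one feedback loop for a toy thermal process. The process dynamics, the proportional control law, every numeric constant, and all names are illustrative assumptions rather than anything defined in this paper, and the system-level input and output (Is, Os) are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class ProcessState:
    temperature: float  # the controlled variable Vc of a toy thermal process


def process(state: ProcessState, heater_power: float,
            disturbance: float, dt: float) -> ProcessState:
    """Fp: manipulated variable, disturbance, and time step give the next Vc."""
    delta = dt * (0.5 * heater_power - disturbance)
    return ProcessState(state.temperature + delta)


def sensor(state: ProcessState) -> float:
    """Fs: maps the controlled variable to a controller input (ideal sensor)."""
    return state.temperature


def controller(measured: float, setpoint: float) -> float:
    """Fc: maps the input and an external command (setpoint) to an output."""
    # Assumed proportional law, clamped to the range [0, 10].
    return max(0.0, min(10.0, 2.0 * (setpoint - measured)))


def actuator(command: float) -> float:
    """Fa: maps the controller output to the manipulated variable (ideal)."""
    return command


def run_loop(steps: int, setpoint: float = 50.0,
             disturbance: float = 1.0, dt: float = 1.0) -> float:
    """Run the closed loop and return the final controlled-variable value."""
    state = ProcessState(temperature=20.0)
    for _ in range(steps):
        measured = sensor(state)                  # feedback via Vc
        command = controller(measured, setpoint)  # control decision
        power = actuator(command)                 # change Vm
        state = process(state, power, disturbance, dt)
    return state.temperature
```

With these assumed constants the loop settles one degree below the setpoint, because a purely proportional law cannot fully cancel a constant disturbance. This is one small illustration of why the disturbances appear explicitly in the process function.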

To reason about this type of process-controlled system, Parnas and Madey 
defined what they call the four-variable model [14]. This model is essentially an 
abstraction of part of the traditional feedback process control model presented 
here. The modeling approaches used in Parnas Tables [13] and SCR [4,3] are
based on this four-variable model and are thus built upon the same classic process
control model.

The model presented in this section is an abstraction — responsibility for im- 
plementing the control function may actually be distributed among several com- 
ponents including analog devices, digital computers, and humans. Furthermore, 
the controller may have only partial control over the process: state changes in
the process may occur due to internal conditions in the process or because of
external disturbances, or the actuators may not perform as expected. For exam-
ple, the pilot in a TCAS system may not follow the resolution advisory (escape
maneuver) issued by the TCAS controller.

The purpose of a control system requirements specification is to define the 
system goals and constraints, the function Fc (i.e., the required blackbox be- 
havior of the controller), and the assumptions about the other components of 
the process-control loop that (1) the implementors need to know in order to 
implement the control function correctly and (2) the system engineers and ana- 
lysts need to know in order to validate the model against the system goals and 
constraints. 

A blackbox, behavioral specification of such a system function Fc uses only:

(1) the current process state inferred from measurements of the controlled vari- 
ables, 

(2) past process states that were measured and inferred, 

(3) past corrective actions output from the controller, and 

(4) prediction of future states of the controlled process 

to generate the corrective actions (or current outputs) needed to maintain F . 

All of this information can be embedded in a state-machine model of the 
controlled process, and we specify the blackbox behavior of the controller (i.e., 
the function Fc to be computed by the controller) using such a state machine 
model. In SpecTRM-RL models, the outputs of the controller are specified with 
respect to state changes in the model as information is received about the current 
state of the controlled process via the controlled variables Vc. In the TCAS
example, the control function is specified using a model of the state of all other 
aircraft within the host aircraft’s airspace, the state of the on-board components 
of its own aircraft (e.g., altimeters, aircraft discretes^, cockpit displays), and the 
state of ground-based radar stations in the vicinity. Information about this state 
is received from the sensors (e.g., antennas and transponders) and commands 
are sent to the actuators (e.g., the pilot and transponders).

The state machine model of the control function must be iteratively
fine-tuned during requirements development to mimic the current understanding of
the real-world process and the required controller behavior. The state machine 
is essentially an abstraction of the behavior of the system function because it 
models all the relevant aspects of the components of the process control loop. 
Errors in the state machine model represent mismatches between this model and 
the desired behavior of the control loop, including the process. 

^ Aircraft discretes are airframe-specific characteristics provided as input to TCAS 
from hardware switches. 



3 Building Blackbox Specifications in SpecTRM-RL

Although RSML allows blackbox behavior specifications, the language itself 
does not encourage or enforce them. We found people tend to include design 
in the specification when using general state-machine modeling languages such 
as RSML or Statecharts. SpecTRM-RL, therefore, was not designed to be a 
general modeling language, but rather specifically designed to create blackbox 
requirements specifications to define an input/output process-control function, 
as is also true of SCR and Parnas Tables. General modeling features not needed 
for blackbox specifications are not included in SpecTRM-RL, and new abstrac- 
tions (such as modes) are included that assist in blackbox modeling of control 
system components. Thus, SpecTRM-RL is not just another variant of Stat- 
echarts although there are some superficial similarities. Like SCR and Parnas 
Tables, SpecTRM-RL enforces the specification of an input/output process con-
trol function. Statecharts allows much more general models to be built.

In order to make our language formal enough to be analyzable (and yet read- 
able and reviewable by non-mathematicians), we have defined a formal model 
(RSM or Requirements State Machine) that underlies a more readable specifi- 
cation language or languages. The RSM is a general behavioral model of the 
required control function with the components of the state machine mapped to 
the appropriate components of the control loop. This model has been published 
previously [6], and we do not refer to it further in this paper. We note only that 
the underlying model is a Mealy automaton, as is the model for SCR, Parnas 
Tables, Statecharts, and most other languages based on state-machines. 

The higher-level specification language based on this underlying model must 
allow the modeler to specify the required process-control function Fc. Figure
3 shows a more detailed view of the components of the control loop, including 
distinguishing between human and automated controllers. 

All control software (and any controller in general) uses an internal model of 
the general behavior and current state of the process that it is controlling. This 
internal model may range from a very simple model including only a few variables 
to a much more complex model with a large number of state variables and 
transitions. The model may be embedded in the control logic of an automated 
controller or in the mental model maintained by a human controller. It is used 
to determine what control actions are needed. The model is updated and kept
consistent with the actual system state through various forms of feedback. 

The design of SpecTRM-RL is influenced by our desire to perform safety 
analysis on the models. When the controller’s model of the system diverges from 
the actual system state, erroneous control commands (based on the incorrect 
model) can lead to an accident — for example, the software does not know that 
the plane is on the ground and raises the landing gear or it does not identify an 
object as friendly and shoots a missile at it. The situation becomes more compli- 
cated when there are multiple controllers (both human and automated) because 
their internal system models must also be kept consistent. In addition, human 
controllers interacting with automated controllers must also have a model of the 
automated controllers’ behavior in order to monitor or supervise the automation 
as well as the controlled system itself. 

One reason the models may diverge is that information about the process 
state has to be inferred from measurements. For example, in TCAS, relative 
range positions of other aircraft are computed based on round-trip message prop- 
agation time. Theoretically, the function Fc can be defined using only the true 
values of the controlled variables or component states (e.g., true aircraft po- 
sitions). However, at any time, the controller has only measured values of the 
component states (which may be subject to time lags or measurement inaccura-
cies), and the controller must use these measured values to infer the true con-
ditions in the process and possibly to output corrective actions (O) to maintain
F. In the TCAS example, sensors include on-board devices such as altimeters
that provide measured altitude (not necessarily true altitude) and antennas for
communicating with other aircraft. The primary TCAS actuator is the pilot,
who may or may not respond to system advisories. Pilot response delays are
important time lags that must be considered in designing the control function.
Time lags in the controlled process (the aircraft trajectory) may be caused by
aircraft performance limitations.

Fig. 3. A basic control loop. A blackbox requirements specification captures
the controller’s internal model of the process. Accidents occur when the internal
model does not accurately reflect the state of the controlled process.

The automated controller also has a model of its interface to the human 
controllers or its supervisor(s). This interface, which contains the controls, dis-
plays, alarm annunciators, etc., is important because it is the means by which the
two controllers’ models are synchronized. Each of these components is included 
explicitly in our models and modeling language. We represent the controlled pro- 
cess and supervisory interface models using state machines and define required 
behavior in terms of transitions in this machine. The controller outputs (com- 
mands to the actuators) are specified with respect to state changes in the model 
as information is received about the current state of the controlled process via 
controlled variables read by sensors. 



[Figure 4 diagram. Automated Controller Model: Operating Modes. Supervisory
Interface: Supervisory Modes, Controls, Displays. Controlled Process Model:
Process Operating Modes, State Variables, Process Interface Variables (measured
and manipulated variables).]



Fig. 4. A SpecTRM-RL model has three parts. 



Thus a SpecTRM-RL specification of control software is composed of three 
interrelated models (Figure 4): (1) a specification of the operating modes of the 
controller, (2) a specification of the controller’s view of its supervisory interface 
(the component or components, including human operators, that are controlling 
it), and (3) a model of the controlled process. 
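
The three-part structure can be pictured as a nested record. The sketch below is only an illustration of that decomposition in Python: the class and field names are hypothetical, chosen to mirror the labels in Figure 4, and they do not reproduce the SpecTRM-RL notation or any tool's data model.

```python
from dataclasses import dataclass, field


@dataclass
class SupervisoryInterface:
    """The controller's view of whoever or whatever is supervising it."""
    supervisory_mode: str
    controls: dict = field(default_factory=dict)
    displays: dict = field(default_factory=dict)


@dataclass
class ControlledProcessModel:
    """The controller's model of the process (plant) it is controlling."""
    operating_mode: str
    state_variables: dict = field(default_factory=dict)
    interface_variables: dict = field(default_factory=dict)  # measured and manipulated


@dataclass
class ControllerModel:
    """A specification made of three interrelated parts."""
    operating_modes: dict
    supervisory_interface: SupervisoryInterface
    controlled_process: ControlledProcessModel


# Hypothetical instance loosely based on the flight management example.
example = ControllerModel(
    operating_modes={"Capture": "Armed"},
    supervisory_interface=SupervisoryInterface(supervisory_mode="Pilot"),
    controlled_process=ControlledProcessModel(operating_mode="Climb"),
)
```

Keeping the supervisory interface and the controlled process as separate records reflects the point made above that each is a model held by the controller, which may disagree with the real interface or process state.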



3.1 The Structure of a SpecTRM-RL Specification 

Engineers often use modes in describing required system functionality. Mode 
confusion also is frequently implicated in the analysis of operator errors that lead 
to accidents. We have included in SpecTRM-RL the ability to describe behavior 
in terms of modes both to reduce semantic distance (and enhance reviewability)
and to allow for analysis of various types of mode-related errors [10].

A mode can be defined as a mutually exclusive set of system behaviors. 
For example, the following table shows the possible transitions between states 
in a simple state machine given two system modes: startup mode and normal 
operation mode. 



              S1   S2   S3   S4   S5

Startup mode  S3   S2   S4   S5   S1
Normal mode   S3   S4   S1   S5   S1



Table 1. A simple state machine with two modes defined using a standard 
state transition table. The states in the machine (listed at the top of the table) 
are S1 through S5 while the conditions under which the transition is made are
listed on the left (e.g., startup mode and normal mode). Note that transitions 
may depend on more conditions than simply the current processing mode. 



The startup and normal processing modes in this machine determine how the 
machine will behave over the entire set of state transitions. For example, if the 
conditions occur that trigger a transition from state S3, the machine will transfer
to state S4 if it is in startup mode or to state S1 if it is in normal processing
mode. Note that modes are simply states that play a particular role in the 
state machine behavior (i.e., control a sequence or set of state transitions). That 
is, they are a convenient abstraction for describing and understanding complex 
system behavior, but they do not add any power to the state machine description. 
In general, state transitions may be triggered by events, conditions, or simply 
the passage of time. The current operating mode determines how these triggers 
will be interpreted and what transitions will be taken. Note that there is no 
real difference between a state and a mode by this definition. Any conditions 
or states could be labeled a “mode” (which indeed is done in some specification 
languages), although this is not very helpful and is not the way engineers use 
the term “mode”.
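
Table 1 can be encoded directly as a lookup keyed by mode and then by current state. The sketch below, in Python, does this and also shows one way to see that modes add no expressive power: the mode can be folded into the state, giving an ordinary transition relation over (mode, state) pairs. The function names are illustrative, and the sketch assumes, as the table does, that the mode itself does not change during a transition.

```python
# Table 1 as a nested mapping: mode -> current state -> next state.
TRANSITIONS = {
    "startup": {"S1": "S3", "S2": "S2", "S3": "S4", "S4": "S5", "S5": "S1"},
    "normal":  {"S1": "S3", "S2": "S4", "S3": "S1", "S4": "S5", "S5": "S1"},
}


def next_state(mode: str, state: str) -> str:
    """Return the successor of `state` when the machine is in `mode`."""
    return TRANSITIONS[mode][state]


def flatten(table: dict) -> dict:
    """Fold the mode into the state, giving one machine over (mode, state) pairs."""
    return {
        (mode, source): (mode, target)
        for mode, row in table.items()
        for source, target in row.items()
    }
```

For example, `next_state("startup", "S3")` gives `"S4"` while `next_state("normal", "S3")` gives `"S1"`, matching the behavior described for state S3.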

Modern aircraft and other complex systems often have a large number of 
operating modes and possible combinations of operating modes. In the modeling 
and analysis of control systems, we find it useful to distinguish between three 
different types of modes: 

1. Supervisory modes determine who or what is controlling the component at 
any time. Control loops may be organized hierarchically, with multiple con- 
trollers or components, each being controlled by the layer above and con- 
trolling the layer below. In addition, each component may have multiple 
controllers (supervisors). For example, a flight control computer in an air- 
craft may get inputs from the flight management computer or directly from 
the pilot. Required behavior may be different depending on what supervi- 
sory mode is currently in effect. Mode-awareness errors related to confusion 
in coordination between the multiple supervisors of a control component can 
be defined in terms of these supervisory modes. 

2. Component operating modes control the behavior of the control component 
itself. They may be used to control the interpretation of the component’s 
interfaces or to describe the component’s required process-control behavior. 

3. Controlled-system (or plant in control theory terminology) operating modes
specify sets of related behaviors of the controlled system and are used to 
indicate its operational state. For example, an aircraft may be in takeoff, 
climb, level-flight, descent, or landing mode. 

The use of modes does not violate the blackbox nature of SpecTRM-RL; they 
represent externally visible behavior (required functionality) of the component 
and not the internal design of the software to achieve that functionality. For 
example, capture mode (which can be armed or not armed) in the flight man-
agement system example shown in Figure 5 indicates whether the aircraft will
automatically level off when a pilot-specified altitude constraint is reached. The 
pilot is responsible for setting the altitude constraint and (usually) for directly 
or indirectly selecting capture mode. 

As stated earlier, a SpecTRM-RL specification has three interrelated models. 
The top box of Figure 5 shows the graphical part of an example specification of 
a flight management system. The system has seven modes of operation, all of 
which have only one value at any one time. The boxes shown under each mode 
label represent the discrete values for that mode, e.g., pitch can be in altitude
hold, vertical speed, indicated air speed, or altitude capture mode. The line at the 
left of the choices simply groups the choices under the variable and indicates 
that only one may be active at any time and does not represent state transitions 
(as it did in RSML). 

Fig. 5. Example of operating modes for a flight management system



A second part of a SpecTRM-RL model is a specification of the component’s 
view of its supervisory interface. The supervisory interface consists of a model of 
the operator controls and displays or other means of communication by which the 
component relays information to the supervisor. Note that the interface models 
are simply the view that the component has of the interfaces — the real state of 
the interface may be inconsistent with the assumed state due to various types
of design flaws or failures. For example, a flight control computer in an aircraft 
may get inputs from the flight management computer or directly from the pilot. 
Required behavior may be different depending on what supervisory mode is 
currently in effect. By separating the assumed interface from the real interface, 
we are able to model and analyze the effects of various types of errors and 
failures (e.g., communication errors or display hardware failures). In addition, 
separating the physical design of the interface from the logical design (required 
content) will facilitate changes and allow parallel development of the software 
and the interface design. 

The third part of a SpecTRM-RL model is the component’s model of the
controlled system (plant). The description of a simple component may include 
only a few relevant state variables. If the controlled process or component is 
complex, the model of the controlled process may itself be represented in terms 
of its operational modes and the states of its subcomponents. In a hierarchical 
control system, the controlled process may itself be a controller of another pro- 
cess. For example, the flight management system may be controlled by a pilot 
and may itself issue commands to a flight control computer, which issues com- 
mands to an engine controller. If, during the design process, components that 
already exist are used, then those plug-in component models could be inserted 
into the SpecTRM-RL process model. 

If the SpecTRM-RL model is of a non-control component (e.g., a radar data 
processor), there might not be a supervisory interface. There will still be oper- 
ating modes, however, and a model of the required input-output function to be 
computed by the component. 

The language itself consists of a graphical model of the state machine, out- 
put message specifications, state variable definitions, operator interface variable 
definitions, state transition specifications, macros, and functions. 



Graphical State Machine. The SpecTRM-RL notation is driven by the use 
of the language to define a function from outputs to inputs. SpecTRM-RL has a 
greatly simplified graphical representation (compared to RSML or Statecharts), 
which is made possible because we eliminated the types of state machine com-
plexity necessary for specifying component design but not necessary to specify
the input/output function computed in a pure blackbox requirements specifica-
tion. The architecture of the state transitions becomes so simple that we found
no need to represent it in the graphical state machine — the transitions simply 
represent the changes between state variable values. 

State values in square boxes represent inferred values. Inferred values are 
not input directly but represent the aspects of the process state model that 
must be inferred from measurements of monitored process variables. Inferred 
process states are used to control the computation of the control function. They
are necessarily discrete in value^, and thus can be represented as a finite-state
variable. In practice, such state variables almost always have only a few relevant 
values (e.g., altitude below 500 feet, altitude between 500 feet and 10,000 feet, 
altitude above 10,000 feet). State values denoted as circles or ovals represent 
direct input and output values (controlled or monitored variables). 
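
As an illustration of an inferred, discrete state value, the following Python sketch maps a measured altitude onto the three bands used in the example above. The band names, the treatment of the exact boundary values (500 and 10,000 feet are placed in the middle band), and the use of an Unknown value when no measurement is available are assumptions made for the sketch; the text does not specify them.

```python
from typing import Optional


def altitude_band(altitude_ft: Optional[float]) -> str:
    """Infer a discrete altitude state from a measured altitude in feet."""
    if altitude_ft is None:
        # No measurement available, so nothing can be inferred.
        return "Unknown"
    if altitude_ft < 500:
        return "Below-500"
    if altitude_ft <= 10_000:
        # Assumed: both boundary values belong to the middle band.
        return "Between-500-And-10000"
    return "Above-10000"
```

The continuous measurement stays available for arithmetic elsewhere in the specification, while only the discrete band is used to select which behavior applies.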

The supervisory interface model shows the supervisory mode, which describes
how this computer is being supervised, e.g., by a human, computer, or both
(Figure 6). It also shows the state of the controls and the displays (including
oral annunciations, etc.).

^ If they are not discrete, then they are not used in the control of the function computa-
tion but in the computation itself and can simply be represented in the specification
by arithmetic expressions involving input variables.

Fig. 6. Example of SpecTRM-RL model of the supervisory modes



Output Message Specification. Everything starts from outputs in SpecTRM- 
RL. By starting from the output specification, the specification reader can de- 
termine what inputs trigger that output and the relationship between the inputs 
and outputs. RSML did not explicitly show this relationship (although it could 
be determined, with some effort, by examining the specification). A simplified
example is shown in Figure 7. More information is actually required by our com-
pleteness criteria than is shown in the example, for instance, specification of
timing assumptions related to the message. 

The condition under which an output is triggered (sent) is simply a predi-
cate logic statement over the various states, variables, and modes in the specifi-
cation. During the TCAS project, we discovered that the triggering conditions
required to accurately capture the requirements were often extremely complex. 
The propositional logic notation traditionally used to define these conditions 
did not scale well to complex expressions and quickly became unreadable (and 
error-prone). To overcome this problem, we decided to use a tabular represen- 
tation of disjunctive normal form (DNF) that we call and/or tables. We have 
maintained this successful notation in SpecTRM-RL. The far-left column of the
and/or table lists the logical phrases. Each of the other columns is a conjunc- 
tion of those phrases and contains the logical values of the expressions. If one 
of the columns is true, then the table evaluates to true. A column evaluates to 
true if all of its elements match the truth values of the associated predicates. A 
dot denotes “don’t care.” 
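
The evaluation rule just described (a table is true if any column is true, and a column is true if every cell that is not a don't-care matches its predicate) can be sketched in a few lines of Python. The function name, the cell encoding ("T", "F", and "." for the dot), and the dictionary of predicate truth values are assumptions of this sketch, not part of the SpecTRM-RL definition.

```python
def eval_and_or(rows, columns, facts):
    """Evaluate an and/or table.

    rows    -- predicate names, one per table row (the far-left column)
    columns -- one sequence per table column, holding "T", "F", or "." per row
    facts   -- mapping from predicate name to its current truth value
    """
    def column_holds(column):
        if len(column) != len(rows):
            raise ValueError("each column needs exactly one cell per row")
        # Every cell that is not a don't-care must match its predicate.
        return all(
            cell == "." or facts[name] == (cell == "T")
            for name, cell in zip(rows, column)
        )

    # The table is the disjunction of its columns.
    return any(column_holds(column) for column in columns)


# Hypothetical table encoding (A and not B) or C.
ROWS = ["A", "B", "C"]
COLUMNS = [["T", "F", "."], [".", ".", "T"]]
```

Each column is one conjunction of the DNF, so a reviewer can read a column top to bottom as a single scenario in which the table holds.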

The subscripts in the specification represent whether the value is a variable
(v) or a state (s). The other alternatives, macros (m) and functions (f), are
described later in this paper. The number attached to the subscript is the page
on which the variable, state, macro, or function is defined.

Output Message: Resolution Advisory

TRIGGERING CONDITION

  Composite-RA[s-266] in state RA                          T
  Traffic-Display-Status[s] in state Waiting-To-Send       T

MESSAGE CONTENTS

  FIELD         VALUE
  Bits 11-17    Own-Goal-Altitude-Rate[219]
  Bits 18-20    Combined-Control[227]
  Bits 21-23    Vertical-Control[231]
  Bits 24-26    Climb-RA[233]
  Bits 27-29    Descent-RA[235]

Fig. 7. Example of SpecTRM-RL output message specification

State Variable Definition. State variable values come from inputs or they 
may be computed from such input values or inferred from other state variable 
values. Figure 8 shows a partial example of a state variable description. Again, 
our desire to enforce completeness requires that state variable definitions in- 
clude such information as arrival rates, exceptional condition handling, data age 
requirements, feedback information, etc., not shown here.

SpecTRM-RL requires all state variables that describe the process state to 
include an unknown value. This value is the default value upon startup or upon 
specific mode transitions (for example, after temporary shutdown of the com- 
puter). This feature is used to ensure consistency between the computer model 
of the process state and the real process state by forcing resynchronization of 
the model with the outside world after an interruption of processing. Many acci- 
dents have been caused by the assumption that the process state does not change 
while the computer is not processing inputs or by incorrect assumptions about 
the initial value of state variables. 
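
The role of the unknown value can be illustrated with a small Python sketch of a state variable that starts as Unknown and is forced back to Unknown on a restart, so that the model must be resynchronized from fresh input before it is trusted. The class, its method names, and the On-Ground example are hypothetical and are not taken from SpecTRM-RL itself.

```python
class StateVariable:
    """A discrete state variable whose value set always includes Unknown."""

    UNKNOWN = "Unknown"

    def __init__(self, name, values):
        self.name = name
        self.values = set(values) | {self.UNKNOWN}
        self.value = self.UNKNOWN  # default value upon startup

    def update(self, new_value):
        """Record a value obtained or inferred from current input."""
        if new_value not in self.values:
            raise ValueError(f"{new_value!r} is not a value of {self.name}")
        self.value = new_value

    def reset(self):
        """Called after an interruption of processing, e.g. a temporary shutdown."""
        self.value = self.UNKNOWN

    def is_known(self):
        return self.value != self.UNKNOWN


# Hypothetical usage: the model may not assume the aircraft is still on the
# ground after a restart; it must wait for a new measurement.
on_ground = StateVariable("On-Ground", ["Yes", "No"])
on_ground.update("Yes")
on_ground.reset()
```

After `reset()`, `on_ground.is_known()` is false until `update` is called again, which is the resynchronization the text describes.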

Unknown is used for state variables in the supervisory interface model only 
if the state of the display can change independently of the software. Otherwise, 
such variables must specify an initial value (e.g., blank, zero, etc.) that should 
be sent when the computer is restarted. 

In the example shown, vertical control is a state variable in the supervisory
interface model and is one of the pieces of information displayed to the pilot as
part of an RA (Resolution Advisory, which is the escape maneuver the pilot is
to implement to avoid the intruder aircraft). Vertical control can have the values
Unknown, Other, Increase, Crossing, Maintain, or Reversal. And/or tables are
used to specify which of these values is displayed to the pilot (given the current
state of the aircraft model and the intruder aircraft being avoided). For example,
Maintain is displayed if the Composite-RA state variable is in state “Climb”,
there is no RA-Strength in state “Increase-2500fpm”, and Corrective-Climb and
Corrective-Descend are both not in state “Yes”. Maintain is also displayed if there
is no RA-Strength in state “Increase-2500fpm”, Composite-RA is in state “Descend”,
and again both Corrective-Climb and Corrective-Descend are not in state “Yes.”

State
Supervisory-Interface > Pilot-Displays > Resolution-Advisory > Vertical-Control

DEFINITION

= Blank
  INITIALLY

= Other
  Some RA-Strength[s-277] in state Increase-2500fpm     F  F  F
  Some Reversal[s-282] in state Reversed                F  F  F
  Composite-RA[s-266] in state Climb                    F  •  •
  Composite-RA[s-266] in state Descend                  F  •  •
  Corrective-Climb[s-263] in state Yes                  •  F  •
  Corrective-Descend[s-264] in state Yes                •  F
  Crossing-Geometry                                     F  F  F

= Increase
  Some RA-Strength[s-277] in state Increase-2500fpm     T
  Climb-Inhibit[s-243] in mode Inhibited                T
  Descend-Inhibit[s-245] in mode Inhibited              T

= Crossing
  Some Reversal[s-282] in state Reversed                F  F  F  F
  Composite-RA[s-266] in state Climb                    T  T  •  •
  Composite-RA[s-266] in state Descend                  •  •  T  T
  Some RA-Strength[s-277] in state Increase-2500fpm     F  F  F  F
  Corrective-Climb[s-263] in state Yes                  F  •  F  •
  Corrective-Descend[s-264] in state Yes                •  F  •  F
  Crossing-Geometry                                     F  F  F  F

= Maintain
  Composite-RA[s-266] in state Climb                    T  •
  Some RA-Strength[s-277] in state Increase-2500fpm     F  F
  Composite-RA[s-266] in state Descend                  •  T
  Corrective-Climb[s-263] in state Yes                  F  F
  Corrective-Descend[s-264] in state Yes                F  F

= Reversal
  Some Reversal[s-282] in state Reversed                T  T  T
  Composite-RA[s-266] in state Climb                    F     •
  Composite-RA[s-266] in state Descend                  F  •  •
  Corrective-Climb[s-263] in state Yes                  T  •  T
  Corrective-Descend[s-264] in state Yes                   •  T

Fig. 8. Example of SpecTRM-RL state variable specification

Timing constraints may also be specified as conditions in the tables (i.e., conditions
on the state transitions) but are not required in this example.
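
The way an AND/OR table is read can be made concrete with a small sketch. The Python below is illustrative only and is not part of SpecTRM-RL: the function `and_or_table`, the name `MAINTAIN_TABLE`, the dictionary encoding of the state, and the RA-Strength value "Nominal" are all assumptions made for this example. Each row is a condition, each column is a conjunction of its row entries, and the table holds if at least one column holds. The table encodes the Maintain condition as it is described in the text.

```python
# Table entries: True = condition must hold, False = condition must not hold,
# None = "don't care" (the dot in the printed tables).

def and_or_table(rows, state):
    """Return True if at least one column of the table is satisfied in `state`.

    `rows` is a list of (predicate, entries) pairs, where entries[i] is the
    entry of that row in column i.
    """
    n_columns = len(rows[0][1])
    for col in range(n_columns):
        if all(entries[col] is None or predicate(state) == entries[col]
               for predicate, entries in rows):
            return True
    return False


# Maintain, as described in the prose: two columns (two alternative conjunctions).
MAINTAIN_TABLE = [
    (lambda s: s["composite_ra"] == "Climb",            (True,  None)),
    (lambda s: "Increase-2500fpm" in s["ra_strengths"], (False, False)),
    (lambda s: s["composite_ra"] == "Descend",          (None,  True)),
    (lambda s: s["corrective_climb"] == "Yes",          (False, False)),
    (lambda s: s["corrective_descend"] == "Yes",        (False, False)),
]

# Hypothetical state used only to exercise the table.
state = {
    "composite_ra": "Descend",
    "ra_strengths": {"Nominal"},
    "corrective_climb": "No",
    "corrective_descend": "No",
}
print(and_or_table(MAINTAIN_TABLE, state))
```

Keeping the table as data rather than as nested `if` statements mirrors the review-oriented intent of the notation: each column can be inspected on its own.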



State Transition Specification. As with all state-machine models, transitions 
in the three parts of a SpecTRM-RL model are governed by external events and 
the current state of the modeled system. In SpecTRM-RL, the conditions under
which transitions are taken are specified separately from the graphical depiction 
of the state machine. We have found that the behavior of real systems is too 
complex to write on a line between two boxes. Instead, we again use and/or 
tables. Figure 9 shows an example specification for a transition. 



[Figure 9: SpecTRM-RL definition of a mode in Own-Aircraft-Operating-Modes, with the transitions INITIALLY -> Unknown (condition: true); Unknown, Not-Inhibited -> Inhibited; and Unknown, Inhibited -> Not-Inhibited. Each of the latter two transitions is guarded by an AND/OR table over the conditions Composite-RA in state No-RA, Altitude-Climb-Inhibit = True, Own-Tracked-Alt > Aircraft-Altitude-Limit, and Config-Climb-Inhibit = True.]

Fig. 9. Example of SpecTRM-RL mode or state transition specification 



Macros and Functions. Macros are simply named pieces of AND/OR ta- 
bles that can be referenced from within another table. For example, the macro 
in Figure 10 is used in the definition of the variable Vertical-Control in Fig- 
ure 8. The macros, for the most part, correspond to typical abstractions used 
by application experts in describing the requirements and therefore add to the 
understandability of the specification. In addition, the abstraction is necessary 
to handle the complexity of guarding conditions in larger systems and we found 
this a convenient abstraction to allow hierarchical review and understanding of 
the specification. Also, rather than including complex mathematical functions
directly in the transition tables, functions are specified separately and referenced
in the tables. For instance, Own-Tracked-Alt in Figure 9 is a function reference.

The macros and functions, as well as the concept of parallel state machines, not
only help structure a model for readability; they also help us organize models to
enable specification reuse. Conditions commonly used in the application domain
can be captured in macros and common functions, such as tracking, can be 
captured in reusable functions. In addition, the parallel state machines allow 
the internal model of each system component (discussed in Section 3) and the 
different system modes to be captured as separate and parallel state machines. 
This helps to accommodate reuse of internal models and operational modes, and 
helps us plan for product families (research goal 4 in the introduction). Naturally, 
to accomplish reuse, care has to be taken when creating the original model to 
determine what parts are likely to change and to modularize these parts so that 
substitutions can be easily made. This structuring, however, is beyond the scope 
of the current paper. 
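
The role of a macro as a named, reusable piece of a guarding condition can be sketched as follows. This Python is an illustration under assumptions, not SpecTRM-RL syntax: the state encoding, the key names `crossings` and `composite_ra`, and the guard `climb_without_crossing` are hypothetical. Only the macro name Crossing-Geometry and its two alternatives (Int-Cross, Own-Cross) come from the text.

```python
def crossing_geometry(state):
    """Macro Crossing-Geometry: true if some Crossing variable is in state
    Int-Cross, or some Crossing variable is in state Own-Cross."""
    crossings = state["crossings"]  # hypothetical: set of states of all Crossing variables
    return "Int-Cross" in crossings or "Own-Cross" in crossings


def climb_without_crossing(state):
    """Hypothetical guarding condition that references the macro by name
    instead of repeating its table."""
    return state["composite_ra"] == "Climb" and not crossing_geometry(state)


print(climb_without_crossing({"composite_ra": "Climb", "crossings": {"No-Cross"}}))
```

Because the macro is defined once, a reviewer can check it in isolation and then read the referencing condition at the level of abstraction used by application experts.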



Macro: Crossing-Geometry

DEFINITION

Some Crossing in state Int-Cross   |  T  |  •
Some Crossing in state Own-Cross   |  •  |  T

Fig. 10. Example of SpecTRM-RL macro specification 



4 Eliminating Internal Broadcast Events 

A third goal for SpecTRM-RL was to eliminate error-prone constructs. Dur- 
ing the independent verification and validation (IV&V) of TCAS II, problems 
with internal broadcast events (used to synchronize parallel state machines in 
Statecharts and RSML) accounted for a clear majority of the errors related to 
the syntax and semantics of RSML. Common and difficult to resolve problems 
involved proper synchronization of mutually interdependent state machines. In 
addition, getting the state machines to correctly model system startup behav- 
ior proved to be surprisingly difficult. Internally generated events seem to cause 
“accidental complexity” in the specification that is not necessarily present in the 
problem being specified. 

These problems were not just the most common language-related problems in
the initial specification; they were also the problems that lingered unresolved (or
incompletely resolved) through several cycles of corrections and repeated IV&V.
Note that the problems related to synchronization were not directly caused by
misunderstandings of the RSML event/action semantics; the event/action mechanism
is quite simple and purposely selected to be intuitive [9]. Instead, the problems
were caused by the complexity of the model and the inherent difficulty of
comprehending the causal relationships between parallel state machines. Thus,
this difficulty is not unique to RSML; it is fundamentally difficult to understand
parallelism and synchronization. Other state-based languages such as Statecharts
[1] and the UML behavioral (state machine) models that use event/action se- 
mantics are likely to encounter the same problem when used to model complex 
systems. When we eliminated internal events, we were surprised at how much 
easier it was to rewrite our old specifications (such as TCAS II) and to create 
new ones. 

The trigger events and actions on the transitions in Statecharts and RSML 
are used for two purposes. First, they are used to sequence and synchronize 
state machines so the next state relation is computed correctly. For instance, 
to determine if an intruding aircraft is a collision threat in TCAS II, we must 
first determine how close the intruder is and how close the intruder is allowed to 
come before it is considered a threat. Thus, the state variables determining the 
intruder status and the sensitivity level of TCAS II must be evaluated before we
determine advisory-status. This is a straightforward (but, as mentioned above,
error-prone) use of events and actions.

Second, events and actions may be employed to use, in essence, the state 
machines as a programming language. The events can be used to create loops 
and counters, and events can be implicitly assigned semantic meaning and used 
for purposes other than synchronization. In our experience we have found this 
freedom of using the events a trap that invites the introduction of design details 
in the specification. During the development of the TCAS II model we had to 
repeatedly remind ourselves to use events prudently; we have found that even 
experienced users of such modeling languages inevitably fall into the trap of 
using events and actions to introduce too much design in the models. 

To solve this problem in SpecTRM-RL, we simply decided not to use internal 
events and instead to rely on the data dependencies in the model to determine 
the order in which transitions are taken, i.e., the ordering, if critical, is explicitly
included in the model as opposed to being built into the semantics of the mod- 
eling language. In this way, the reviewer need not rely on knowledge about the 
semantics of the modeling language to determine if the model correctly matches 
the intended functional behavior — that behavior (which state transitions follow 
which) is explicitly specified in the constructed model. A similar argument holds 
for the modeler. We found that different reviewers of our TCAS specification 
were assigning differing semantics to the state transition ordering. In the ex- 
ample above, the transitions in the state machine advisory status refer to the 
states of intruder status and sensitivity level; thus, intruder status and sensitiv- 
ity level will be evaluated before advisory status. This sequencing will assure a 
correct evaluation of the next state relation based on the data dependencies of 
the transitions and variables. The next state function is recomputed every time 
the environment changes an input variable. Naturally, a SpecTRM-RL specifica- 
tion cannot include any cycles in the data dependencies. Cycles in a specification 
can be easily detected by our tools. 
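
Ordering transitions by data dependencies, and rejecting cyclic specifications, can be sketched with a topological sort. The Python below is an assumption-laden illustration rather than the authors' tool: the dictionary `dependencies` and the function `evaluation_order` are hypothetical names, and only the three state machines named in the text are modeled. It uses `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

# Each state variable maps to the set of variables its transition conditions read.
dependencies = {
    "advisory_status": {"intruder_status", "sensitivity_level"},
    "intruder_status": set(),
    "sensitivity_level": set(),
}


def evaluation_order(deps):
    """Return an order in which every variable follows the variables it reads.

    Raises ValueError if the data dependencies contain a cycle.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as err:
        # err.args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"cyclic data dependency: {err.args[1]}") from err


order = evaluation_order(dependencies)
print(order)  # advisory_status always appears after both variables it reads
```

The sort yields only one linearization of the partial order; any linearization computes the same next state, because variables that do not depend on each other can be evaluated in either order.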

In Statecharts and RSML, a transition is not taken until an explicit event is 
generated. When the transition is taken, additional events may be generated as 
actions. In this way, the events propagate through the state machine triggering 
transitions. In our formalization of the semantics of RSML [2] we view each 
transition as a simple function mapping one state to the next. The events and 
actions on the transitions are used to determine in which order we shall use these 
functions to compute the next state. We define the new semantics of SpecTRM- 
RL in essentially the same way as for RSML. The only difference is how we 
determine in which order to apply the functions representing transitions. We 
now rely entirely on the data dependencies between the transitions to determine 
a partial order that is used during the computation of the next state relation. 
The semantics of SpecTRM-RL have been defined formally, but this definition
is not included for space reasons.



5 Conclusions 

In this paper, we described some lessons learned during experimentation with a
formal specification language (RSML) and how we have used what we learned
to drive further research. We showed how a formal modeling language can be
designed to assist system understanding and the requirements modeling effort.
We achieve this by grounding the design of the language in the domain for 
which it is intended (process control) and how people actually think about and 
conceptualize complex systems. 

We have applied these principles to the design of a new experimental language
called SpecTRM-RL. As mentioned above, SpecTRM-RL evolved from
our previous experiences with using RSML to specify large and complex systems.
In particular, we addressed the problems associated with inclusion of excessive
design in the blackbox specification and internal broadcast events. Our experience
thus far indicates that the new language design principles introduced in
SpecTRM-RL greatly enhance the usability of a formal notation.
Abstract. Recently, many formal methods, such as the SCR (Software
Cost Reduction) requirements method, have been proposed for improving 
the quality of software specifications. Although improved specifications 
are valuable, the ultimate objective of software development is to produce 
software that satisfies its requirements. To evaluate the correctness of a 
software implementation, one can apply black-box testing to determine 
whether the implementation, given a sequence of system inputs, produces 
the correct system outputs. This paper describes a specification-based 
method for constructing a suite of test sequences, where a test sequence 
is a sequence of inputs and outputs for testing a software implementation. 
The test sequences are derived from a tabular SCR requirements spec- 
ification containing diverse data types, i.e., integer, boolean, and enu- 
merated types. From the functions defined in the SCR specification, the 
method forms a collection of predicates called branches, which “cover” 
all possible software behaviors described by the specification. Based on 
these predicates, the method then derives a suite of test sequences by 
using a model checker’s ability to construct counterexamples. The paper 
presents the results of applying our method to four specifications, includ- 
ing a sizable component of a contractor specification of a real system. 



1 Introduction 

During the last decade, numerous formal methods have been proposed to im- 
prove software quality and to decrease the cost of software development. One 
of these methods, the SCR (Software Cost Reduction) method, is based on a 
user-friendly tabular notation and offers several automated techniques for de- 
tecting errors in software requirements specifications, including an automated 
consistency checker to detect missing cases and other application-independent 
errors [14]; a simulator to symbolically execute the specification to ensure that 
it captures the users’ intent [13]; and a model checker to detect violations of 
critical application properties [3,12]. Recently, groups at NASA and Rockwell 
Aviation as well as our group at NRL have used the SCR techniques to detect 
serious errors in requirements specifications of real-world systems [7,21,12]. By 
exposing defects in the requirements specification, such techniques help the user 
improve the specification’s quality. This improved specification provides a solid 
foundation for the later phases of the software development process. 

While high-quality requirements specifications are clearly valuable, the ultimate
objective of the software development process is to produce high-quality
software, i.e., software that satisfies its requirements. To weed out software errors
and to help convince customers that the software performance is acceptable, the
software needs to be tested. An enormous problem, however, is that software testing,
especially of safety-critical systems, is extremely costly and time-consuming.
It has been estimated that current testing methods consume between 40% and
70% of the software development effort [2].

O. Nierstrasz, M. Lemoine (Eds.): ESEC/FSE ’99, LNCS 1687, pp. 146-162, 1999.

One benefit of a formal method is that the high-quality specification it pro- 
duces can play a valuable role in software testing. For example, the specification 
may be used to automatically construct a suite of test sequences. These test 
sequences can then be used to automatically check the implementation software 
for errors. By eliminating much of the human effort needed to build and to apply 
the test sequences, such an approach should reduce both the enormous cost and 
the significant time and human effort associated with current testing methods. 

This paper describes an original method for generating test sequences from
an operational SCR requirements specification containing mixed variable types,
i.e., integers, booleans, and enumerated types. In our approach, each test sequence
is a sequence of system inputs and their associated outputs [19]. The
requirements specification is used both to generate a valid sequence of inputs
and as an oracle [17] that determines the set of outputs associated with each
input. To obtain a valid sequence of inputs, the input sequence is constrained
to satisfy the input model (i.e., assumptions about the inputs) that is part of
the requirements specification. Our method for generating test sequences “covers”
the set of all possible input sequences by organizing them into equivalence
classes and generating one or more test sequences for each equivalence class.

Section 2 reviews the SCR method, and Section 3 describes the objectives
of an effective suite of test sequences. After showing how test sequences can
be derived from system properties, Section 4 presents our original method for
generating test sequences from operational requirements specifications and the
branch coverage criterion that the method applies. Section 5 describes a tool we
developed that uses either of two model checkers to automatically generate test
sequences; it also presents the results of applying the tool to four specifications.
Section 6 reviews related work, and Section 7 presents a summary and plans for
future work.

2 Background: The SCR Requirements Method 

The SCR method was formulated in 1978 to specify the requirements of the Op- 
erational Flight Program (OFP) of the U.S. Navy’s A-7 aircraft [15]. Since then, 
many industrial organizations, including Bell Laboratories, Grumman, Ontario 
Hydro, and Lockheed have used SCR to specify the requirements of practical 
systems. The largest application to date occurred in 1994 when Lockheed en- 
gineers used SCR tables to document the complete requirements of Lockheed’s 
C-130J OFP [9], a program containing more than 250K lines of Ada. Each of 
these applications of SCR had, at most, weak tool support. To provide powerful, 
robust tools customized for the SCR method, we have developed the SCR toolset, 
which includes the consistency checker, simulator, and model checker mentioned 
above. To provide formal underpinnings for the tools, we have formulated a for- 
mal model which defines the semantics of SCR requirements specifications [14]. 

An SCR requirements specification describes both the system environment, 
which is nondeterministic, and the required system behavior, which is usually 
deterministic [14]. The SCR model represents the environmental quantities that
the system monitors and controls as monitored and controlled variables. The 
environment nondeterministically produces a sequence of input events, where an 
input event signals a change in some monitored quantity. The system, represented 
in the model as a state machine, begins execution in some initial state and then 
responds to each input event in turn by changing state and by possibly producing 
one or more output events, where an output event is a change in a controlled 
quantity. An assumption of the model is that at each state transition, exactly one 
monitored variable changes value. To concisely capture the system behavior, SCR 
specifications may include two types of auxiliary variables, mode classes, whose 
values are modes, and terms. Mode classes and terms often capture historical
information. 

In the SCR model, a system is represented as a 4-tuple, $(S, S_0, E^m, T)$, where
$S$ is the set of states, $S_0 \subseteq S$ is the initial state set, $E^m$ is the set of input events,
and T is the transform describing the allowed state transitions [14]. Usually, the 
transform T is deterministic, i.e., a function that maps an input event and the 
current state to a new state. To construct T, we compose smaller functions, each 
derived from the two kinds of tables in SCR requirements specifications, event 
tables and condition tables. These tables describe the values of each dependent 
variable, that is, each controlled variable, mode class, or term. The SCR model 
requires the entries in each table to satisfy certain properties. These properties 
guarantee that all of the tables describe total functions. 

In SCR, a state s is a function that maps each variable in the specification 
to a type-correct value, a condition is a predicate defined on a system state, 
and an event is a predicate defined on a pair of system states implying that the 
value of at least one state variable has changed. When a variable changes value, 
we say that an event “occurs”. The expression “@T(c) WHEN d” represents a
conditioned event, which is defined by

$$@T(c)\ \text{WHEN}\ d \;\stackrel{\mathrm{def}}{=}\; \neg c \wedge c' \wedge d,$$

where the unprimed conditions $c$ and $d$ are evaluated in the current state and
the primed condition $c'$ is evaluated in the next state.
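
The definition of a conditioned event can be read as a predicate over a pair of states. The following Python is a minimal sketch under assumptions: the function name `conditioned_event`, the dictionary representation of a state, and the example conditions `too_hot` and `armed` are hypothetical and do not come from any SCR specification.

```python
def conditioned_event(c, d, state, next_state):
    """@T(c) WHEN d: c is false in the current state, c is true in the
    next state, and d holds in the current state."""
    return (not c(state)) and c(next_state) and d(state)


# Hypothetical conditions over a state represented as a dictionary.
too_hot = lambda s: s["temp"] > 100   # plays the role of c
armed = lambda s: s["armed"]          # plays the role of d

before = {"temp": 95, "armed": True}
after = {"temp": 105, "armed": True}

print(conditioned_event(too_hot, armed, before, after))  # True: c becomes true
print(conditioned_event(too_hot, armed, after, after))   # False: c already held
```

The second call shows why the event is defined on a pair of states: a condition that is already true does not "become true" again, so no event occurs.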

3 Attributes of an Effective Suite of Test Sequences 

A practical method should be supported by “pushbutton” techniques, techniques 
that can be invoked with the mere push of a button. One example of a pushbutton
technique is automated consistency checking [14]. Our goal is to also make
software testing a pushbutton technique, i.e., as automatic as possible. Our ap- 
proach to software testing focuses on conformance testing, black-box testing 
that determines whether an implementation exhibits the behavior described by 
its specification. This approach divides testing into two phases. During the first 
phase, an operational requirements specification is used to automatically con- 
struct a suite of test sequences. During the second phase, a test driver feeds 
inputs from the test sequences to the software implementation, and then a com- 
parator compares the outputs produced by the implementation with the outputs 
predicted by each test sequence, reporting all discrepancies between the two sets 
of outputs. Clearly, discrepancies between the two sets of outputs expose cases 
in which the software implementation violates the requirements specification. 
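
The second phase (test driver plus comparator) can be sketched in a few lines. This Python is an illustration under assumptions, not the tool described in this paper: `run_test_sequence` and `make_threshold_alarm` are hypothetical names, and the toy implementation under test is invented for the example. A test sequence is modeled as a list of (input, expected output changes) pairs.

```python
def run_test_sequence(implementation, test_sequence):
    """Feed each input to the implementation and compare its outputs with the
    outputs predicted by the test sequence. Returns the list of discrepancies."""
    discrepancies = []
    for step, (test_input, expected) in enumerate(test_sequence, start=1):
        actual = implementation(test_input)
        if actual != expected:
            discrepancies.append((step, test_input, expected, actual))
    return discrepancies


def make_threshold_alarm(limit):
    """Toy implementation under test: reports a change of `alarm` whenever the
    input crosses `limit`; returns {} when no output changes."""
    alarm = "Off"

    def step(value):
        nonlocal alarm
        new_alarm = "On" if value > limit else "Off"
        changed = {} if new_alarm == alarm else {"alarm": new_alarm}
        alarm = new_alarm
        return changed

    return step


# Expected outputs assume the alarm threshold is 5.
sequence = [(3, {}), (7, {"alarm": "On"}), (4, {"alarm": "Off"})]
print(run_test_sequence(make_threshold_alarm(5), sequence))  # [] (conforms)
print(run_test_sequence(make_threshold_alarm(8), sequence))  # discrepancies at steps 2 and 3
```

Because the comparator only needs the predicted outputs, the same driver can replay any generated sequence without knowledge of how the sequence was constructed.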
The challenge of software testing methods is to produce an effective suite of
test sequences. Like Fujiwara et al. [10], we believe that an effective suite of test 
sequences satisfies two (conflicting) objectives: 

- The number of test sequences in the suite should be small. Similarly, the
number of test data (i.e., the length of the input sequence) in each test
sequence should also be small.

- The test suite should “cover” all errors that any implementation may contain.
That is, it should evaluate as many of the different possible behaviors of the
software as possible.

In our approach, each test sequence is a complete scenario, which starts in 
a legal initial state and which contains, at each state transition, a valid system 
input coupled with a set of valid system outputs. Both the assumptions that 
constrain the system inputs and how the system outputs are computed from the 
system inputs are described by an SCR requirements specification. 

4 Generating Test Sequences with a Model Checker

Normally, a model checker is used to analyze a finite-state representation of a 
system for property violations. If the model checker analyzes all reachable states 
and detects no violations, then the property holds. If, in contrast, the model 
checker finds a reachable state that violates the property, it returns a “coun- 
terexample,” a sequence of reachable states beginning in a valid initial state and 
ending with the property violation. We use model checking not for verification 
nor to detect specification errors but, like some others [1,5,8], to construct test 
sequences. Like these others, we base our method on two ideas. First, the model 
checker is used as an oracle to compute the expected outputs. Second, the model 
checker’s ability to generate counterexamples is used to construct the test se- 
quences. To force the model checker to construct the desired test sequences, we 
use a set of properties called trap properties. 

Section 4.1 describes how trap properties are derived from system proper- 
ties provided by designers or customers. Then, Section 4.2 describes an original 
extension of this method which derives trap properties systematically and au- 
tomatically from an operational SCR requirements specification. Deriving trap 
properties in this manner ensures that the test sequences “cover” all possible 
behaviors described by the specification. To demonstrate that our method can 
be used with different model checkers, we use two different model checkers to 
construct the test sequences. The example presented in Section 4.1 uses the 
symbolic model checker SMV [20], whereas the example presented in Section 4.2 
uses the explicit state model checker Spin [16]. To translate an SCR specification 
into either the language of Spin or the language of SMV, we use the translation 
method described in [3]. A Spin specification and an SMV specification obtained 
from an SCR specification using this translation method are semantically equiv- 
alent. Section 4.3 describes our coverage criterion, a form of branch coverage. 

4.1 Generating Test Sequences from Properties 

To introduce our method, we illustrate how the model checker SMV may be 
used to obtain a test sequence from a system property and an SCR requirements 
specification. We consider a system called the Safety Injection System (SIS), a 
simplified version of a control system for safety injection in a nuclear plant [6],
which monitors water pressure and injects coolant into the reactor core when the
pressure falls below some threshold. The system operator may override safety
injection by turning a “Block” switch to “On” and may reset the system after
blockage by setting a “Reset” switch to “On”.

To specify the SIS requirements in SCR, we represent the SIS inputs with
the monitored variables WaterPres, Block, and Reset and the single SIS output
with a controlled variable SafetyInjection. The specification also includes two
auxiliary variables, a mode class Pressure, an abstract version of WaterPres,
and a term Overridden which indicates when safety injection has been overridden.
An important component of the SIS specification is the input model for
WaterPres, which constrains WaterPres to change by no more than 3 psi¹ from
one state to the next. The input model that describes the possible changes to
WaterPres is defined by $\{(w, w') : |w - w'| \le 3,\ 0 \le w, w' \le 30\}$, where $w$
represents the value of WaterPres in one state and $w'$ represents its value in
the next state. In this example, a constant Low=10 defines the threshold that
determines when WaterPres is in an unsafe region.
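
The input model can be checked mechanically for any candidate sequence of WaterPres values. The Python below is a minimal sketch: the function names are hypothetical, and it assumes that the bounds are inclusive, i.e., that a change of exactly 3 psi is allowed and that WaterPres ranges over 0 to 30.

```python
def valid_waterpres_change(w, w_next, max_step=3, lo=0, hi=30):
    """True if (w, w_next) belongs to the input model for WaterPres
    (inclusive bounds are an assumption of this sketch)."""
    return abs(w - w_next) <= max_step and lo <= w <= hi and lo <= w_next <= hi


def satisfies_input_model(values):
    """True if every consecutive pair of WaterPres values is a legal change."""
    return all(valid_waterpres_change(a, b) for a, b in zip(values, values[1:]))


print(satisfies_input_model([2, 5, 8, 10, 8]))  # True: no change exceeds 3 psi
print(satisfies_input_model([2, 6]))            # False: change of 4 psi
```

A test generator must respect this constraint; otherwise it could produce input sequences that can never occur in the environment the specification assumes.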

Suppose that we have verified that the operational SCR specification of SIS
satisfies a safety property called P, which is defined by

@T(WaterPres < Low) WHEN Block = On ∧ Reset = Off ⇒ SafetyInjection = Off.

Property P states that if WaterPres drops below the constant Low when Block
is On and Reset is Off, then SafetyInjection must be Off.

We can use the property P and the operational SCR specification of the SIS
to construct a test sequence as follows. First, the operational specification is
translated into the SMV language in the manner described in [3]. If our goal were
to verify P, we would next translate P into CTL, the temporal logic of SMV.
Because our goal is not to verify P but to construct a test sequence from P,
we instead translate the negation of P’s premise into CTL, i.e.,

AG!( EX(WaterPres<Low) & !WaterPres<Low & Block = On & Reset = Off ),

where AG! represents ‘never’, EX represents ‘next’, and ! represents negation.
Because the negation of P’s premise is false in the SCR specification, running
SMV detects a violation. To demonstrate the violation, SMV produces a counterexample,
i.e., a trace of input events which starts in a valid initial state and
ends when a violation of the CTL property is detected. This trace provides the
basis for the desired test sequence. The CTL property is an example of a trap
property.
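
The construction of a trap property from the premise of a property is purely syntactic, which the following sketch illustrates. The helper `trap_property` is hypothetical and is not part of SMV or of the SCR toolset; it only builds a string, and it adds parentheses around the negated condition to make the grouping explicit. Whether a particular SMV version accepts exactly this concrete syntax is an assumption.

```python
def trap_property(condition, when):
    """Build a CTL trap property for the conditioned event
    @T(condition) WHEN when: claim that the event never occurs, so that a
    model checker returns a trace in which it does occur."""
    return f"AG !(EX({condition}) & !({condition}) & {when})"


print(trap_property("WaterPres < Low", "Block = On & Reset = Off"))
```

The counterexample to such a property is useful precisely because the property is expected to be false: the trace that refutes it is a scenario that reaches the event of interest.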

Table 1 illustrates the test sequence that can be constructed from the counterexample
produced when SMV detects a violation of the above trap property
in the SIS specification. In the table, the initial values of WaterPres, Block,
Reset, SafetyInjection, and Pressure are shown in step 0, which represents
the initial state. (Due to lack of space, Table 1 omits the term Overridden.)
To clarify which variable values change from one state to the next, Table 1
shows only the variable values which change at each step and omits the values that
remain the same. Note that the changes in WaterPres from one state to the
next never exceed 3 psi and thus satisfy the constraints of the input model.

¹ The abbreviation “psi” represents “pounds per square inch.”

Step No. | Monitored Var. Value              | Controlled Var. Value | Mode Class Value
---------|-----------------------------------|-----------------------|--------------------
0        | WaterPres=2, Block=Off, Reset=On  | SafetyInjection=On    | Pressure=TooLow
1        | Reset=Off                         |                       |
2        | WaterPres=5                       |                       |
3        | WaterPres=8                       |                       |
4        | WaterPres=10                      | SafetyInjection=Off   | Pressure=Permitted
5        | Block=On                          |                       |
6        | WaterPres=8                       |                       | Pressure=TooLow

Table 1. Test Sequence Constructed from SMV Counterexample. 

The six inputs that lead to the violation of the trap property form the input 
sequence for the test sequence. Table 1 shows that the only change to the SIS 
output produced by this input sequence is the change at step 4 in the value of 
SafetyInjection.

The test sequence of length six shown in Table 1 may be represented more
concisely as

$$\langle (r, \mathit{Off}; -),\ (w, 5; -),\ (w, 8; -),\ (w, 10; s, \mathit{Off}),\ (b, \mathit{On}; -),\ (w, 8; -) \rangle, \qquad (1)$$

where $r$, $w$, and $b$ represent the input variables Reset, WaterPres, and Block;
$s$ represents the single output variable SafetyInjection; and $-$ indicates that
no output variable changes.² Clearly, checking the software behavior with this
test sequence will test whether the software satisfies property P. In addition to
changes in output values, a test sequence may also include changes in the values
of one or more auxiliary variables. For example, the test sequence in (1) could
be extended to include changes in the mode class Pressure, which changes (see
Table 1) to Permitted at step 4 and to TooLow at step 6.

Although this method can test many critical aspects of the system behavior, 
it has several weaknesses. First, the method assumes that the customers (or the 
designers) have formulated a set of system properties. Unfortunately, formulat- 
ing such properties is not a normal step in requirements specification, and hence 
such properties may not be available. Second, and more important, is the incompleteness
of the test sequences. Even if a large set of properties is available for
generating test sets, questions remain about how completely the test sequences 
cover all possible system behaviors. Finally, the method assumes the correctness 
of both the operational specification and the properties. Our experience with 
SCR specifications and the SCR tools convinces us that achieving a high-quality 
SCR specification is feasible. In contrast, verifying that the specifications satisfy 
a given set of properties is more problematic. Although there has been recent 
progress in using model checkers and theorem provers to verify properties, for- 
mal verification still suffers from both theoretical problems (e.g., problems of 
decidability) and practical problems (time and space necessary for the proofs). 

² In this example, the initial state is unique and thus may be omitted from the test
sequence. 






152 A. Gargantini and C. Heitmeyer 



4.2 Generating Test Sequences from an Operational Specification 

This section describes an original method for constructing test sequences which 
does not depend on a set of system properties. This method automatically trans- 
lates an operational requirements specification in the SCR notation to the lan- 
guage of a model checker and automatically and systematically generates test 
sequences. This section first describes how test seqTicnccs can be generated from 
an event table and next how they can be generated from a condition table. 



Old Mode  | Event                   | New Mode
----------|-------------------------|----------
TooLow    | @T(WaterPres ≥ Low)     | Permitted
Permitted | @T(WaterPres ≥ Permit)  | High
Permitted | @T(WaterPres < Low)     | TooLow
High      | @T(WaterPres < Permit)  | Permitted


Table 2. Event Table Defining the Mode Class Pressure. 



Generating Test Sequences from an Event Table. To illustrate the method, 
we consider the event table in the SIS specification (see Table 2) which defines 
the value of the mode class Pressure. The mode class has three modes: TooLow, 
Permitted, and High. At any given time, the system must be in one and only 
one of these modes. A drop in water pressure below the constant Low causes the 
system to enter mode TooLow; an increase in pressure above a larger constant 
Permit=20 causes the system to enter mode High. Figure 1 shows the function 
that can be derived from Table 2 using the definition in [14]. The else clause in 
Figure 1 indicates that events not explicitly named in the table do not change the 
value of the variable being defined. To make the set of test sequences complete, 
we need to construct test sequences not only for the cases described explicitly 
in the table but for these “no-change” cases as well. 

if
□ Pressure = TooLow ∧ @T(WaterPres ≥ Low)         -> Pressure = Permitted
□ Pressure = Permitted ∧ @T(WaterPres ≥ Permit)   -> Pressure = High
□ Pressure = Permitted ∧ @T(WaterPres < Low)      -> Pressure = TooLow
□ Pressure = High ∧ @T(WaterPres < Permit)        -> Pressure = Permitted
□ (else)                                          -> Pressure = Pressure
fi

Fig. 1. Function Defining Pressure With a Single else Clause.

To produce an interesting suite of test sequences for the no-change cases, we 
replace the definition of the mode class Pressure in Figure 1 with the equivalent 
definition in Figure 2, which associates an else clause with each possible value of 
Pressure. In Figure 2, the no-change cases are labeled C2, C5, and C7. In each 
case, the value of Pressure does not change. Further, in each case, either the 
monitored variable WaterPres changes (in a way that satisfies the constraints 

if
□ Pressure = TooLow
    if
    □ @T(WaterPres ≥ Low)     -> Pressure = Permitted    C1
    □ (else)                  -> Pressure = Pressure     C2
    fi
□ Pressure = Permitted
    if
    □ @T(WaterPres ≥ Permit)  -> Pressure = High         C3
    □ @T(WaterPres < Low)     -> Pressure = TooLow       C4
    □ (else)                  -> Pressure = Pressure     C5
    fi
□ Pressure = High
    if
    □ @T(WaterPres < Permit)  -> Pressure = Permitted    C6
    □ (else)                  -> Pressure = Pressure     C7
    fi
fi

Fig. 2. Function Defining Pressure With One else Clause per Mode. 

of that case) or WaterPres does not change but another monitored variable (in 
this example, either Block or Reset) changes. In our approach, each of the Ci 
labels a “branch,” i.e., an equivalence class of state transitions. Together, these 
branches cover the interesting state transitions, i.e., those that change the value 
of the variable that the table defines and those that do not. Our approach is 
to construct one or more test sequences for each branch. For example, for the 
branch labeled C1, one or more test sequences will be constructed that satisfy 
both the hypothesis and the conclusion of the property 

Pressure = TooLow ∧ @T(WaterPres ≥ Low) ⇒ Pressure′ = Permitted. 
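
The per-mode function of Figure 2 can be sketched in executable form. The following Python fragment is an illustration only, not part of the SCR tools: the constant values LOW = 10 and PERMIT = 20 and the names at_t and next_pressure are assumptions made for this sketch, and the comparison operator >= follows the Promela translation in Figure 3. The function returns the new mode together with the number i of the branch Ci that fired.

```python
from typing import Tuple

LOW = 10      # assumed value of the constant Low (illustration only)
PERMIT = 20   # assumed value of the constant Permit (illustration only)


def at_t(old: bool, new: bool) -> bool:
    """SCR event @T(c): c is false in the old state and true in the new state."""
    return (not old) and new


def next_pressure(mode: str, wp_old: int, wp_new: int) -> Tuple[str, int]:
    """Return (new mode, number i of the branch Ci of Figure 2 that applies)."""
    if mode == "TooLow":
        if at_t(wp_old >= LOW, wp_new >= LOW):
            return "Permitted", 1
        return mode, 2                      # no-change case C2
    if mode == "Permitted":
        if at_t(wp_old >= PERMIT, wp_new >= PERMIT):
            return "High", 3
        if at_t(wp_old < LOW, wp_new < LOW):
            return "TooLow", 4
        return mode, 5                      # no-change case C5
    if mode == "High":
        if at_t(wp_old < PERMIT, wp_new < PERMIT):
            return "Permitted", 6
        return mode, 7                      # no-change case C7
    raise ValueError(f"unknown mode: {mode}")
```

Every call falls into exactly one of the seven branches, which is what makes the branches usable as equivalence classes of state transitions.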

To translate from the tabular format into Promela, the language of Spin, we 
apply the translation method described in [3]. Figure 3 shows the translation into 
Promela of the two branches labeled C1 and C2 in Figure 2. Because Promela 
does not allow expressions containing both “current” and “primed” values of 
variables, two Promela variables are assigned to each SCR variable. In Figure 3, 
each variable with a suffix of “P” represents a primed variable. We also translate 
each event in an event table to a Promela if statement. Thus, the event in the 
first row of Table 2 (labeled C1 in Figure 2) is translated to the if statement 
in lines 4-5 of the Promela code in Figure 3. In addition, we translate each else 
clause to a corresponding else statement in Promela. Thus, the else clause in the 
branch labeled C2 in Figure 2 is translated to a corresponding else statement 
in the sixth line of the Promela code in Figure 3. 

To label the different cases* in the Promela code, we introduce an auxiliary 
integer-valued variable Casevar, where var names the variable defined by 
the table. To indicate the case corresponding to Ci, we assign Casevar the 
value i.† In this manner, we ensure that every branch assigns Casevar a 
different value. Thus, in Figure 3, which shows two branches, the first branch is 
labeled CasePressure = 1 and the second CasePressure = 2. 

* Henceforth, this paper uses the terms “branch” and “case” interchangeably. 

† The introduction of an auxiliary variable is unnecessary and used here solely for 
clarity and generality. Our method works just as well without the use of new variables. 
For example, we may use Promela’s goto statement or similar alternatives. 

if
:: (Pressure == TooLow) ->
   if
   :: (WaterPresP >= Low) && !(WaterPres >= Low)
        -> PressureP = Permitted; CasePressure = 1;
   :: else -> CasePressure = 2;
   fi
fi

Fig. 3. Promela Code for Cases C1 and C2. 

Next, we use Spin to construct one or more test sequences for each branch. 
To that end, we define a trap property which violates the predicate in the case 
statement. To construct a test sequence that corresponds to CasePressure = 1 
in Figure 3, we construct the negation of CasePressure = 1 and insert it into a 
Promela assert statement: 

assert (CasePressure != 1). 

Then, any trace violating this trap property is a trace which satisfies the case 
(i.e., branch) with which CasePressure = 1 is associated. 
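
The role of the trap property can be illustrated by replaying a candidate input sequence and reporting the first step at which the assertion would fail. The Python sketch below is a stand-in for that check, not the Spin analysis itself; the constants LOW = 10 and PERMIT = 20, the initial value WaterPres = 2, and the names RULES, step, and first_violation are assumptions introduced for the example. Only changes to WaterPres are modelled.

```python
LOW, PERMIT = 10, 20  # assumed constants (illustration only)

# For each mode: (guard over old/new WaterPres, new mode, case number).
RULES = {
    "TooLow": [(lambda old, new: old < LOW <= new, "Permitted", 1)],
    "Permitted": [
        (lambda old, new: old < PERMIT <= new, "High", 3),
        (lambda old, new: new < LOW <= old, "TooLow", 4),
    ],
    "High": [(lambda old, new: new < PERMIT <= old, "Permitted", 6)],
}
NO_CHANGE_CASE = {"TooLow": 2, "Permitted": 5, "High": 7}


def step(mode, old, new):
    """Apply one WaterPres change; return (new mode, value of CasePressure)."""
    for guard, new_mode, case in RULES[mode]:
        if guard(old, new):
            return new_mode, case
    return mode, NO_CHANGE_CASE[mode]


def first_violation(water_pres_values, trap_case, wp_init=2, mode_init="TooLow"):
    """Return the 1-based step at which assert(CasePressure != trap_case) fails.

    Returns None if the sequence never violates the trap property.
    """
    mode, wp = mode_init, wp_init
    for step_no, new_wp in enumerate(water_pres_values, start=1):
        mode, case = step(mode, wp, new_wp)
        wp = new_wp
        if case == trap_case:
            return step_no
    return None
```

Under these assumptions, the input sequence 5, 8, 10 violates assert (CasePressure != 1) at its third step, so its prefix up to that step is a test sequence for the branch with CasePressure = 1.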

When Spin analyzes the SIS specification for the above trap property, it 
produces, as expected, a counterexample. From this counterexample, our method 
derives the test sequence of length 20 shown in Table 3. As required, the last 
two states of the test sequence satisfy the predicate labeled C1 in Figure 2. 
That is, the sequence concludes with two states (s, s′) such that, in state s, 
WaterPres < Low and Pressure is TooLow (implied by WaterPres = 9 at step 
19) and, in state s′, WaterPres ≥ Low and Pressure is Permitted (implied by 
WaterPres = 10 at step 20). Moreover, as required by the input model, 
the change in the value of WaterPres from one state to the next never exceeds 
three psi. One problem with this test sequence is its excessive length. Section 5 
discusses this problem and shows how a much shorter test sequence may be built 
using SMV. 



Step  Monitored Var.  Controlled Var.        Step  Monitored Var.  Controlled Var.
No.   Value           Value                  No.   Value           Value
----  --------------  -------------------    ----  --------------  -------------------
1     Block On                               11    WaterPres 4
2     Reset Off                              12    Block On        SafetyInjection Off
3     Block Off                              13    Block Off
4     Block On        SafetyInjection Off    14    Reset On        SafetyInjection On
5     Block Off                              15    Block On
6     WaterPres 3                            16    Reset Off
7     Block On                               17    WaterPres 5
8     Reset On        SafetyInjection On     18    WaterPres 6
9     Block Off                              19    WaterPres 9
10    Reset Off                              20    WaterPres 10    SafetyInjection Off

Table 3. Test Sequence Derived from Spin Counterexample for Case C1 of Fig. 2. 


Mode               Conditions
---------------    ----------    --------------
TooLow             Overridden    NOT Overridden
Permitted, High    True          False
---------------    ----------    --------------
SafetyInjection    Off           On

Table 4. Condition Table Defining SafetyInjection. 



Generating Test Sequences from a Condition Table. Test sets are generated 
from condition tables in a similar manner. For example, consider the condition 
table in Table 4, which defines the controlled variable SafetyInjection. 
Table 4 states, “If Pressure is TooLow and Overridden is true, or if Pressure is 
Permitted or High, then SafetyInjection is Off; if Pressure is TooLow and 
Overridden is false, then SafetyInjection is On.” The entry “False” in the 
rightmost column means that SafetyInjection is never On when Pressure is 
Permitted or High. This table generates the four cases, C1, C2, C3, and C4, 
shown in Figure 4. Because condition tables explicitly define total functions (they 
never contain implicit no-change cases), there is no need to generate additional 
branches containing else clauses. Note that the two modes in the second row 
of Table 4 generate two different cases, C3 and C4. 



if
□ Pressure = TooLow
    if
    □ Overridden = true    -> SafetyInjection = Off    C1
    □ Overridden = false   -> SafetyInjection = On     C2
    fi
□ Pressure = Permitted     -> SafetyInjection = Off    C3
□ Pressure = High          -> SafetyInjection = Off    C4
fi

Fig. 4. Function Defining SafetyInjection. 



4.3 Branch Coverage 

Associated with the method described in Section 4.2 is a precise, well-defined 
notion of test coverage. Our method assures branch coverage by observing the 
following rules: 

1. In each condition table, every condition not equivalent to false is tested at 
least once. 

2. In each event table, every event is tested at least once. 

3. In each event table, in each mode, every no-change case is tested at least 
once. 

For example, applying the first rule to the condition table in Table 4 generates 
a minimum of four test sequences, one each for the two conditions shown when 
Pressure is TooLow and one each for the two modes, Permitted and High, which 
guarantee that SafetyInjection is Off. 
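
As an illustration of the first rule, the condition table can be written as a total function and evaluated on every combination of mode and Overridden value. The Python sketch below is not part of the SCR tools; the names safety_injection and cases_exercised are hypothetical, and the case numbers follow the labels of Figure 4.

```python
from itertools import product

MODES = ("TooLow", "Permitted", "High")


def safety_injection(pressure: str, overridden: bool):
    """Return (value of SafetyInjection, case number) as defined by Table 4."""
    if pressure == "TooLow":
        return ("Off", 1) if overridden else ("On", 2)
    if pressure == "Permitted":
        return "Off", 3
    if pressure == "High":
        return "Off", 4
    raise ValueError(f"unknown mode: {pressure}")


def cases_exercised():
    """Evaluate the table on every (mode, Overridden) pair and collect the cases hit."""
    return {safety_injection(m, o)[1] for m, o in product(MODES, (True, False))}
```

Every input is assigned a case, which reflects the fact that a condition table defines a total function, and exactly four distinct cases appear, one per required test sequence.
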

The third rule, which addresses the no-change cases, is very important, especially 
for high assurance systems, where very high confidence is needed that 
a variable changes when it should change but does not change when it should 
not change. We found the coverage provided by the third rule too weak because 
many input events do not change any dependent variable. To achieve greater 
coverage, we modified our method so that it can generate test sequences satisfying 
a stronger rule, i.e., 

3′. In each event table, in each mode, a change in each monitored variable which 
does not change the value of the variable that the table defines is tested at 
least once. 

For example, consider the no-change case C2 for the function defining the mode 
class Pressure (see Figure 2). For this case, our method would generate three 
test sequences, one corresponding to a change in each of the monitored variables 
Block, Reset, and WaterPres. (Of course, WaterPres could only change to some 
other value below the constant Low.) Similarly, three test sequences would also 
be constructed for each of the other no-change cases, C5 and C7. Hence, for this 
event table, our method would construct 13 test sequences. 
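
The count of 13 can be reproduced by listing the test obligations explicitly. In the Python sketch below, which is illustrative only, the case labels are those of Figure 2 and the function name obligations is hypothetical. Each of the four event cases yields one obligation; each no-change case yields one obligation under rule 3 and one per monitored variable under the stronger rule.

```python
EVENT_CASES = ["C1", "C3", "C4", "C6"]        # branches that change Pressure
NO_CHANGE_CASES = ["C2", "C5", "C7"]          # one else branch per mode
MONITORED = ["Block", "Reset", "WaterPres"]   # monitored variables of the SIS example


def obligations(strong: bool):
    """List (case, changed monitored variable or None) pairs to be tested."""
    result = [(case, None) for case in EVENT_CASES]
    for case in NO_CHANGE_CASES:
        if strong:
            # Stronger rule: one test per monitored variable that changes.
            result.extend((case, var) for var in MONITORED)
        else:
            # Rule 3: a single test per no-change case.
            result.append((case, None))
    return result
```

With strong=True the list has 4 + 3 × 3 = 13 entries; with strong=False it has 4 + 3 = 7.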

We obtain coverage for the event table defining Pressure because the modes 
TooLow, Permitted, and High cover the state space. In many event tables, how- 
ever, not all of the modes are mentioned explicitly because, in some modes, no 
event changes the value of the variable that the table defines. In such situations, 
the no-change cases that involve the missing modes are covered by appending 
an additional else clause to the function definition derived from the table. 

5 A Tool for Automatically Generating Test Sequences 

We have developed a tool in Java that uses a model checker to construct a suite 
of test sequences from an SCR requirements specification. To achieve this, the 
tool automatically translates the SCR specification into the language of either 
SMV or Spin, constructs the different cases, executes the model checker on each 
case and analyzes the results, derives the test sequences, and writes each test 
sequence into a file. For each case that it processes, the tool checks whether 
the case is already covered by one or more existing test sequences. If so, it 
proceeds to the next case. If not, the tool runs the model checker and transforms 
the counterexample generated by the model checker into a test sequence. As it 
processes cases, the tool sometimes finds that a new test sequence t2 covers all 
cases associated with a previously computed test sequence t1. In this situation, 
the test sequence t1 is discarded because it is no longer useful. 
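
The case-processing loop described above can be sketched as follows. This Python fragment is not the Java tool; it is a minimal sketch in which the model checker run and the coverage analysis are passed in as the hypothetical callables find_trace and cases_covered, and a test sequence is modelled as a tuple of input events.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List, Optional, Tuple

Case = Hashable
Trace = Tuple  # hypothetical representation: a tuple of input events


def build_suite(
    cases: Iterable[Case],
    find_trace: Callable[[Case], Optional[Trace]],
    cases_covered: Callable[[Trace], Iterable[Case]],
) -> Tuple[List[Tuple[Trace, FrozenSet[Case]]], List[Case]]:
    """Return (useful test sequences with their coverage, cases left uncovered)."""
    suite: List[Tuple[Trace, FrozenSet[Case]]] = []
    uncovered: List[Case] = []
    for case in cases:
        if any(case in covered for _, covered in suite):
            continue  # an existing test sequence already covers this case
        trace = find_trace(case)  # stand-in for one model checker run
        if trace is None:
            uncovered.append(case)  # no counterexample was produced
            continue
        new_cov = frozenset(cases_covered(trace))
        # Discard earlier sequences whose cases are all covered by the new one.
        suite = [(t, cov) for t, cov in suite if not cov <= new_cov]
        suite.append((trace, new_cov))
    return suite, uncovered
```

The sequences that remain in the returned suite correspond to the useful test sequences; the second component lists the cases for which no sequence was found.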

Because many software errors occur at data boundaries, we designed a tool 
option that causes extra test sequences to be constructed at data boundaries. 
Turning on this comparison-split option causes the tool to split a branch containing 
the relation x ≥ y (x and y are integers) into two branches, one containing 
x > y and the other containing x = y. Similarly, this option splits a branch 
containing x > y into two branches, one containing x = y + 1 and the second 
containing x > y + 1. When this option is selected, the tool will generate two 
test sequences to test the given relation rather than one alone. 
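
The splitting rule can be sketched as a small rewriting function. The Python fragment below is illustrative only; the tuple representation of a relation and the name comparison_split are assumptions. It reads the first relation as x ≥ y, since only that relation is equivalent, over the integers, to the disjunction of x > y and x = y.

```python
from typing import List, Tuple

Relation = Tuple[str, str, str]  # (left operand, operator, right operand)


def comparison_split(rel: Relation) -> List[Relation]:
    """Split an integer comparison into a boundary branch and an interior branch."""
    left, op, right = rel
    if op == ">=":
        # x >= y  becomes  x > y  and  x == y
        return [(left, ">", right), (left, "==", right)]
    if op == ">":
        # x > y  becomes  x == y + 1  and  x > y + 1
        return [(left, "==", f"{right} + 1"), (left, ">", f"{right} + 1")]
    return [rel]  # other operators are left unsplit in this sketch
```

For example, comparison_split(("WaterPres", ">=", "Low")) yields one branch for WaterPres > Low and one for the boundary WaterPres == Low.
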

              No. of  No. of    Total Test Seq.  Useful Test Seq.  Exec. Time     Total Steps
Specif.       Vars    Branches  Spin    SMV      Spin    SMV       Spin   SMV     Spin     SMV
------------  ------  --------  ------  -------  ------  --------  -----  ------  -------  ----
Small SIS     6       33        7       14       5       11        73s    3.7s    280      62
Large SIS     6       33        12      14       3       11        165s   4099s   10,529   778
Cruise Cont.  5       27        19      23       7       15        100s   5.1s    245      66
WCP1          55      50        15      -        10      -         500s   ∞       4795     -

Table 5. Automatic Generation of Test Sequences Using Spin and SMV. 

5.1 Experimental Results 

This section describes the results of applying our tool to four specifications: 
the small SIS specification described in Section 4; a larger SIS specification; 
the small Cruise Control specification in [18]; and the WCP1 specification, a 
mathematically sound abstraction of a large contractor specification of a real 
system [12]. The larger SIS is identical to the small SIS, except Low is 900, 
Permit is 1000, and WaterPres ranges between 0 and 2000 and changes by no 
more than 10 psi per step. The purpose of applying our test generation tool to the 
WCP1 specification was to evaluate our method on a large, realistic specification. 
The WCP1 specification is a sizable component of a contractor specification in 
which five real-valued variables have been replaced by five integer variables.‡ 
(Model checking the original contractor specification is infeasible because its 
state space is infinite.) The reduced specification is still quite large, containing 
55 variables: 20 monitored variables, 34 auxiliary variables, and one controlled 
variable. In processing all four specifications, the comparison-split option was off. 
In generating test sequences for the three smaller specifications, we applied rule 
3′ from Section 4.3 to obtain wider coverage. Due to the large size of the WCP1 
specification and consequently the long execution time that we anticipated, in 
generating test sequences for WCP1, we applied rule 3, which is weaker. 

Tables 5 and 6 summarize some results from our experiments. In Table 5, 
No. of Vars gives the total number of variables in each specification and No. of 
Branches the total number of branches. Total Test Seq. gives the total number 
of test sequences generated and Useful Test Seq. the number of test sequences 
remaining after weaker test sequences (those covered by other test sequences) are 
discarded, Exec. Time indicates the total seconds required to construct the test 
sequences, and Total Steps describes the total number of input events the tool 
processed in constructing the test sequences. For both Spin and SMV, Table 6 
shows the number of “unreachable” cases the tool encountered (see below) and 
the lengths of the useful test sets generated for each specification. 

5.2 Spin vs. SMV 

Because their approaches to model checking are significantly different, Spin and 
SMV produced very different results, both in the test sequences generated and in 
their efficiency on large examples. As Table 6 shows, the main problem with Spin 

‡ Although the test sequences generated from the WCP1 specification contain abstract 
versions (i.e., discrete versions) of the real-valued variables, translating each abstract 
variable to one or more real-valued variables, while tedious, is straightforward. We 
have developed, and plan to implement, an algorithm that automatically transforms 
an abstract test sequence into a concrete test sequence. 

              Spin-Generated Test Sequences                 SMV-Generated Test Sequences
Specif.       Useful  Unreach.  Lengths                     Useful  Unreach.  Lengths
------------  ------  --------  --------------------------  ------  --------  -------------------------
Small SIS     5       1         1, 20, 54, 99, 106          11      1         2, 3(2), 5(2), 6(2), 8(4)
Large SIS     3       1         3082, 3908, 3539            11      1         2, 3(2), 91(2), 101(4)
Cruise Cont.  7       7         31(2), 35, 37(4)            15      7         2(3), 3(3), 5(3), 6(6)
WCP1          10      2?        2(2), 3, 30, 72, 104,
                                895, 1086, 1292, 1309

Table 6. Unreachable Cases and Test Sequence Lengths for Four Specifications. 

is the length of the test sequences it generates. Because Spin does a depth-first 
search of the state-machine model, it produces very long counterexamples (for 
us, very long test sequences). Although we applied the Spin switch which finds 
the shortest counterexample, this approach rarely (if ever) found a test sequence 
with the shortest possible length. 

This led us to experiment with SMV, which produces the shortest possible 
counterexamples because its search is breadth-first. To illustrate the results of 
generating test sequences using counterexample generation in SMV, we reconsider 
the branch labeled C1 in Figure 2 (see Section 4.2). Table 7 shows the test 
sequence of length 3 that our tool derived from the counterexample generated 
by SMV. Clearly, this test sequence is a cheaper way to test a software implementation 
of SIS for the behavior described in case C1 than the test sequence 
of length 20 shown in Table 3. 
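
The effect of search order on counterexample length can be illustrated on a toy model. The Python sketch below is an explicit-state stand-in and reproduces neither SMV's symbolic algorithm nor Spin's search; the constants (Low = 10, a WaterPres range of 0 to 30, an initial value of 2) are assumptions, and only the three-psi limit on changes comes from the input model discussed in Section 4.2. Both functions look for an input sequence whose last step is a C1 transition.

```python
from collections import deque

LOW, MAX_STEP = 10, 3                 # Low is assumed; the 3 psi limit is from the input model
WP_MIN, WP_MAX, WP_INIT = 0, 30, 2    # assumed range and initial value


def successors(wp):
    """All WaterPres values reachable in one step, largest changes first."""
    for delta in range(MAX_STEP, 0, -1):
        for nxt in (wp + delta, wp - delta):
            if WP_MIN <= nxt <= WP_MAX:
                yield nxt


def shortest_trace(start=WP_INIT):
    """Breadth-first search: the first C1 transition found ends a shortest trace."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        wp, trace = queue.popleft()
        for nxt in successors(wp):
            if wp < LOW <= nxt:
                return trace + [nxt]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [nxt]))
    return None


def depth_first_trace(start=WP_INIT):
    """Depth-first search: returns the first trace found, which may be long."""
    stack, seen = [(start, [])], set()
    while stack:
        wp, trace = stack.pop()
        if wp in seen:
            continue
        seen.add(wp)
        for nxt in successors(wp):
            if wp < LOW <= nxt:
                return trace + [nxt]
            if nxt not in seen:
                stack.append((nxt, trace + [nxt]))
    return None
```

Under these assumptions the breadth-first search returns a sequence of three WaterPres changes, the same length as the sequence in Table 7, while the depth-first search returns a longer one.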



Step  Monitored Var.  Controlled Var.
No.   Value           Value
----  --------------  -------------------
1     WaterPres 5
2     WaterPres 8
3     WaterPres 10    SafetyInjection Off

Table 7. Test Sequence Derived from SMV Counterexample for Case C1 of Fig. 2. 

For each of the three specifications for which it produced results, using SMV 
to construct the test sequences dramatically reduced the length of the test se- 
quences. However, SMV also produced many more test sequences than Spin. For 
example, in analyzing the smaller SIS specification (see Table 6), SMV produced 
11 useful test sequences ranging in length from 2 to 8, whereas Spin generated 
five useful test sequences of lengths 1, 20, 54, 99, and 106. 
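
The length entries of Table 6 can be cross-checked against the totals of Table 5. The Python sketch below assumes that an entry of the form n(k) denotes k test sequences of length n; this reading is an assumption, supported by the fact that it reproduces the counts quoted above. The function name expand_lengths is hypothetical.

```python
import re


def expand_lengths(text: str):
    """Expand a list such as '2, 3(2), 8(4)' into [2, 3, 3, 8, 8, 8, 8]."""
    lengths = []
    for item in text.split(","):
        match = re.fullmatch(r"\s*(\d+)(?:\((\d+)\))?\s*", item)
        if match is None:
            raise ValueError(f"unrecognized entry: {item!r}")
        length = int(match.group(1))
        count = int(match.group(2) or 1)   # a bare number stands for one sequence
        lengths.extend([length] * count)
    return lengths


smv_small_sis = expand_lengths("2, 3(2), 5(2), 6(2), 8(4)")
spin_small_sis = expand_lengths("1, 20, 54, 99, 106")
```

For the small SIS specification this gives 11 SMV sequences with 62 steps in total and 5 Spin sequences with 280 steps in total, matching the Useful Test Seq. and Total Steps columns of Table 5.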

As Tables 5 and 6 show, SMV not only generated shorter counterexamples 
but, for small examples, was also faster than Spin. However, for 
large examples, SMV required long computation times, whereas Spin was generally 
faster and covered the entire specification with fewer, but very long, test 
sequences. In the case of WCP1, SMV ran out of memory before it generated 
any test sequences (indicated by ∞ in Table 5). In contrast, Spin generated test 
sequences for every specification. 

The reason for these differences lies in the different approaches to model checking 
taken by Spin and SMV. Spin uses explicit state enumeration to verify properties, 
i.e., it computes the set of reachable states by enumeration (“running the 
model”), whereas SMV represents the reachable states symbolically as a BDD 
formula. Because the number of reachable states in requirements specifications 
is usually far smaller than the number of possible states and because the BDD 
formula computed by SMV becomes enormous when it includes constraints on 
input variables, Spin often does better than SMV on large SCR specifications, 
especially in finding a state where the property is false [3]. 

5.3 “Unreachable” States 

In generating test sequences for the three smaller specifications, our tool exposed 
several cases that involved “unreachable” states (see Table 6). For example, the 
tool found one unreachable state in each of the SIS specifications. In each, the 
model checker tried to find a trace in which the mode Pressure made a transition 
from TooLow to High. In both specifications, such a transition is impossible given 
the constraints on changes in WaterPres and the values of the constants Low and 
Permit. Similarly, all seven of the unreachable cases shown in Table 6 for the 
Cruise Control specification also involve impossible transitions; for example, a 
transition in which IgnOn changes and the mode class CruiseControl remains 
the same is easily shown to be impossible, using the invariants in [18]. 
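
The impossibility of a direct transition from TooLow to High can be checked by brute force. The Python sketch below uses the values given for the larger SIS in Section 5.1 (Low = 900, Permit = 1000, WaterPres between 0 and 2000, changes of at most 10 psi per step); the function name and the use of "at or above Permit" as the condition for High are assumptions made for the illustration.

```python
def can_jump_over_permitted(low, permit, wp_min, wp_max, max_step):
    """True if one WaterPres change can go from below Low to at or above Permit."""
    for old in range(wp_min, low):                 # every value with mode TooLow
        lowest = max(wp_min, old - max_step)
        highest = min(wp_max, old + max_step)
        if any(new >= permit for new in range(lowest, highest + 1)):
            return True
    return False


# Larger SIS: the gap between Low and Permit exceeds the largest allowed change.
large_sis_jump = can_jump_over_permitted(900, 1000, 0, 2000, 10)
```

The result is False for the larger SIS, so no trace can exhibit the transition, and a model checker asked for one finds no counterexample. A hypothetical step limit of 101 psi or more would make the transition possible.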

For large specifications, a model checker sometimes runs out of memory (or 
time) before it finds a counterexample. In this situation, our method cannot 
detect whether the case is unreachable or is simply too complex to be analyzed 
by the model checker with the available memory. When this situation occurs, our 
tool identifies the case that the method failed to cover, so that the designer can 
consider the problem and, if necessary, build the test sequence by hand. This 
situation occurred when the tool was model checking the WCP1 specification 
with Spin and ran out of memory before it had generated test sequences for the 
two “unreachable” cases listed in Table 6. Our suspicion is that these two test 
sequences do not involve impossible transitions. Instead, the large size of the 
WCP1 specification probably caused Spin to run out of memory before it found 
traces for these two cases. 

5.4 Role of Abstraction 

Although the WCP1 specification has many more variables and is much more 
complicated than the larger SIS specification, Spin required many fewer input 
events in generating test sequences for the WCP1 specification than in generating 
test sequences for the larger SIS specification. In our view, the reason is 
that abstraction was applied effectively to the WCP1 specification; in contrast, 
no abstraction was applied to the SIS specification. In processing specifications, 
Spin usually changes every input variable in the specification, not only the 
“interesting” input variables, many times. By eliminating the uninteresting input 
variables using abstraction, it should be possible to decrease the length of the 
test sequences that Spin produces. 

5.5 Effective Use of Model Checking 

Although model checking may be used to verify properties of specifications, the 
enormous state space of finite-state models of practical software specifications 
often leads to the state explosion problem: the model checker runs out of memory 
or time before it can analyze the complete state space. This occurs even when 
partial order and other methods for reducing the state space are applied. Model 
checking is thus usually more effective in detecting errors and generating coun- 
terexamples than in verification [11,3]. Hence, we are using a model checker in 
the most effective way. 

6 Related Work 

At least two groups [22,23] have formulated a formal framework where both test- 
ing criteria and test oracles are formally defined. In this approach programmers 
use the same model both to specify programs and to construct test sequences. 
While both groups present guidelines for constructing test sequences, unlike us, 
they do not describe a concrete method for automatically generating the test 
sequences. 

Recently, three other groups have used a model checker to generate test 
cases complete with output values [5,8,1]. Callahan and his colleagues use a 
process representing the specification to examine traces generated by a process 
simulating the program. In this manner, they detect and analyze discrepancies 
between a software implementation and the specification (a set of properties). 
They use Spin as an oracle to compute the system outputs from the process 
representing the specification. Engels et al. also describe the use of Spin to 
generate test sequences [8]. They assume the designer has defined the “testing 
purpose,” analogous to our trap properties. A major weakness is the reliance of 
their method on a manual translation of the specification to Spin, which requires 
some skill and ingenuity. Because both methods use properties to construct the 
test sequences, they suffer the weaknesses of property-based methods described 
in Section 4.1. Ammann, Black, and Majurski have proposed a novel approach 
based on mutations, which uses SMV to generate test sequences [1]. By applying 
mutations to both the specification and the properties, they obtain a large set 
of test sequences, some of which describe correct system executions and others 
describing incorrect system executions. Using this method, a correct software 
implementation should pass tests that describe correct executions and fail tests 
for incorrect executions. However, this method lacks the systematic treatment 
of the no-change cases provided by our method. 

Blackburn et al. [4] describe a different method for generating test sequences 
from SCR specifications, which does not use a model checker. In their method, 
a tool called T-VEC derives a set of test vectors from SCR specifications. In 
this vector-oriented approach, each test sequence is simply a prestate/poststate 
pair of system inputs and outputs. Although this method has proven useful for 
testing software modules, its use in black-box software testing is problematic, 
because it does not provide a valid sequence of inputs leading to each pair of 
state vectors. 

7 Summary and Plans 

This paper has described an original method for automatic generation of test 
sequences from an operational requirements specification using a model checker. 
The method has several desirable features: 

— It uses an operational requirements specification both to construct a valid 
sequence of system inputs and to compute the expected system outputs from 
the input sequence. 

— It generates a suite of test sequences that cover the state transitions by 
generating test sequences from cases explicitly described in the specification 
and from cases that are implicit in the specification (the “no-change” cases). 
These sequences test the most critical input sequences, those that should 
change the system state (as specified in the event tables) and those that 
should not change the system state. Every condition in a condition table is 
also tested. 

— It may be applied using either the explicit state model checker Spin or the 
symbolic model checker SMV. 

To illustrate the utility of our approach, we showed how a tool that implements 
our method can generate test sequences for some small examples and for a large 
component of a contractor specification of a real-world system. These early re- 
sults demonstrate both the method’s potential efficiency and its practical utility. 

A number of other important issues remain. First, we plan to experiment 
with abstraction to address both the state explosion problem encountered with 
SMV and the long input sequences produced by Spin. Given the effectiveness of 
combining mathematically sound, automated abstraction methods with model 
checking for detecting specification errors [12], we are optimistic that abstraction 
can also prove effective in making automatic test sequence generation from large 
specifications practical. Second, we will study alternative methods for selecting 
test sequences for a given branch. Our current method usually constructs a single 
test sequence for each branch of a function definition. An important question 
is how to select a collection of test sequences that are adequate for testing the 
behavior specified by that branch. One alternative which could prove useful when 
a large number of variable values exist (e.g., large ranges of numerical values) 
is to use a statistical method to select test sequences. Another alternative is 
to select test sequences systematically by further case splitting. To determine 
which particular test sequences to select, a method like that of Weyuker and her 
colleagues [24] may be useful. Finally, we plan to use the suite of test sequences 
that our tool generates from a given SCR requirements specification to test a 
real software implementation. 


Abstract. Specification of software for safety critical, embedded computer 
systems has been widely addressed in the literature. To achieve the 
high level of confidence in a specification’s correctness necessary in many 
applications, manual inspections, formal verification, and simulation must 
be used in concert. Researchers have successfully addressed issues in in- 
spection and verification; however, results in the areas of execution and 
simulation of specifications have not made as large an impact as desired. 

In this paper we present an approach to specification-based prototyping 
which addresses this issue. It combines the advantages of rigorous for- 
mal specifications and rapid systems prototyping. The approach lets us 
refine a formal executable model of the system requirements to a de- 
tailed model of the software requirements. Throughout this refinement 
process, the specification is used as a prototype of the proposed software. 
Thus, we guarantee that the formal specification of the system is always 
consistent with the observed behavior of the prototype. The approach is 
supported with the Nimbus environment, a framework that allows the 
formal specification to execute while interacting with software models 
of its embedding environment or even the physical environment itself 
(hardware-in-the-loop simulation). 



1 Introduction 

Validating software specifications for embedded systems presents particularly 
difficult problems. The software’s correctness cannot be determined without con- 
sidering its intended operating environment. In these systems, the software must 
interact with a variety of analog and digital components and be able to detect 
and recover from error conditions in the environment. In addition, the software 
is often subject to rigorous safety and performance constraints. 

Assurance that the software specification possesses desired properties can 
be achieved through (1) manual inspections, (2) formal verification, or (3) sim- 
ulation and testing. To achieve the high level of confidence in the correctness 
required in many of today’s critical embedded systems, all three approaches must 
be used in concert. This paper addresses some of the capabilities of Nimbus, 
our environment for embedded systems development. 

The capability to dynamically analyze, or execute, the description of a soft- 
ware system early in a project has many advantages: it helps the analyst to 
evaluate and address poorly understood aspects of a design, improves commu- 
nication between the different parties involved in development, allows empirical 
evaluation of design alternatives, and is one of the more feasible ways of validat- 
ing a system’s behavior. 

Rapid prototyping is one way of achieving executability for systems. It has 
been successful when specialized tools and languages are applied to specific, 
well defined domains. For example, prototyping of user interfaces and database 
systems has been highly successful. Languages in these domains, such as Power- 
Builder and Visual Basic, provide powerful, high-level features and have achieved 
great success in industry. In the embedded systems domain, however, rapid pro- 
totyping does not appear to have been as successful. 

In this paper, we focus on an approach to simulation and debugging of formal 
specifications for embedded systems called specification-based prototyping [4]. 
Within the context of specification execution and simulation, specification-based 
prototyping combines the advantages of traditional formal specifications (e.g., 
unambiguity and analyzability) with the advantages of rapid prototyping (e.g., 
risk management and early end-user involvement). The approach lets us refine a 
formal executable model of the system requirements to a detailed model of the 
software requirements. Throughout this refinement process, the specification is 
used as an early prototype of the proposed software. By using the specification 
as the prototype, most of the problems that plague traditional code-based pro- 
totyping disappear. First, the formal specification will always be consistent with 
the behavior of the prototype (excluding real-time response) and the specifica- 
tion is, by definition, updated as the prototype evolves. Second, the capability to 
evolve the prototype into a production system is largely eliminated. Finally, the 
dynamic evaluation of the prototype can be augmented with formal analysis. 

To enable specification-based prototyping, we have developed the Nimbus re- 
quirements engineering environment. Nimbus, among other things, allows us to 
dynamically evaluate an RSML (Requirements State Machine Language) specification
while interacting with (1) user input or text file input scripts, (2) RSML
models of the components in the embedding environment, (3) software simulations
of the components, or (4) the physical components themselves. When
starting to develop Nimbus, we identified the following fundamental properties
such an environment must possess. First, it must support the execution of the
specification while interacting with accurate models of the components in the 
surrounding environment, be that RSML specifications, numerical simulations, 
statistical models, or physical hardware. Second, the environment must allow an 
analyst to easily modify and interchange the models of the components. Third, 
as the specification is being refined to a design and finally production code, there 
should not be any large conceptual leaps in the way in which the control software 
communicates with the embedding environment. 

In the next section, we provide a short overview of some related approaches 
to prototyping and executable specifications. Section 3 presents a small example 
and Section 4 outlines RSML and the Nimbus environment. In Section 5 we 
discuss how Nimbus is used to evaluate and refine a system requirements model
to a software requirements model. Section 6 concludes.



2 Related Work

2.1 Executable Specification Languages 

An executable specification language is a formally well defined, very high-level 
specialized programming language. Most executable specification languages are 
intended to play many roles in the software development process. For instance, 
languages such as PAISLey [26], ASLAN [3], and REFINE [1] are intended to 
replace requirements specifications, design specifications, and, in some instances,
implementation code. Executable specification languages have achieved some 
success and have been applied to industrial size projects. Many languages have 
elaborate tool support and facilitate refinement of a high level specification into 
more detailed design descriptions or implementation code. 

Nevertheless, current executable specification languages have several drawbacks.
Most importantly, their syntax and semantics are close to those of traditional
programming languages. Therefore, they do not currently provide the level of
abstraction and readability necessary for a requirements notation [8,9].

Notable exceptions to the languages discussed above are a collection of state-based
notations. Statecharts [10,11], SCR (Software Cost Reduction) [15,16], and
RSML [19] are very high-level and provide excellent support for inspections
since they are relatively easy to use and understand for all stakeholders in a
specification effort. These languages allow automated verification of properties
such as completeness and consistency [12,15], and efforts are underway to model
check state-based specifications of large software systems [2,5].

Even so, none of the tools supporting these languages met the requirements
that we established for a prototyping environment. SCR and the original RSML
tool did not allow as flexible or as easy an integration of component models as
we desired. Statemate provides the ability to integrate with other tools; however,
this integration is achieved via a complex simulation backplane. This differs from
the goals that we had for the Nimbus environment: we wanted a framework in
which it would be easy to integrate many different models of the components in
the environment and which used a simple model of inter-component interaction,
not a complex co-simulation tool.



2.2 Prototyping 

There are two main approaches to prototyping. One approach is to develop a 
draft implementation to learn more about the requirements, throw the prototype
away, and then develop production quality code based on the experiences from 
the prototyping effort. The other approach is to develop a high quality system 
from the start and then evolve the prototype over time. Unfortunately, there are 
problems with both approaches. 

The most common problem with throw-away prototyping is managerial: many
projects start developing a throw-away prototype that is later, in a futile attempt
to save time, evolved and delivered as a production system. This misuse of a
throw-away prototype inevitably leads to unstructured and difficult-to-maintain
systems.

Dedicated prototyping languages have been developed to support evolutionary
prototyping [18,23]. These languages simplify the prototyping effort by supporting
execution of partial models and providing default behavior for underspecified
parts of the software. Although prototyping languages have achieved
some initial success, it is not clear that they provide significant advantages over
traditional high-level programming languages. Evolutionary prototyping often
leads to unstructured and difficult-to-maintain systems. Furthermore, incremental
changes to the prototype may not be captured in the requirements specification
and design documentation, which leads to inconsistent documentation and
a maintenance nightmare.

Software prototypes have been successfully used for certain classes of systems, 
for example, human-machine interfaces and information systems. However, their 
success in embedded systems development has been limited [6]. Clearly, a dis- 
cussion of every other prototyping technique is beyond the scope of this paper. 
Nevertheless, most work in prototyping is, in our opinion, too close to design and 
implementation or is not suitable to the problem domain of embedded safety- 
critical systems. 

Notable examples of work in prototyping include PSDL [18,22] and Rapide 
[20,21]. PSDL is based on having a reusable library of Ada modules which can be 
used to animate the prototype. Nevertheless, it seems that this approach would 
preclude execution until a fairly detailed specification was developed. Rapide is
a useful prototyping system, but it does not integrate as easily with other tools
as we desired. In addition, Rapide’s scope is too broad for
our needs; we wanted a tool set that was focused on the challenges presented by
embedded systems.



3 The Altitude Switch 

This section describes a simple example drawn from the avionics domain:
the Altitude Switch (ASW). The ASW is a hypothetical device that turns power 
on to another subsystem when the aircraft descends below a threshold altitude. 
While the ASW appears almost trivial, it raises a surprising number of issues, 
particularly regarding how it interacts with its environment. 




Fig. 1. The ASW system in its environment 



The ASW and its environment are shown in Figure 1. The ASW receives 
altitude information from an analog radio altimeter and two digital radio al- 
timeters, with the altitude taken as the lowest valid altitude seen. If the altitude 
cannot be determined for more than two seconds, the ASW indicates a fault by 
failing to strobe a watchdog timer. A fault is also indicated if internal failures 
are detected in the ASW. The detection of a fault turns on an indicator lamp 
within the cockpit.

The environment in which the ASW operates includes several features which
make the system more interesting. First, the ASW software does not have complete
control over the device of interest (DOI), the subsystem that the ASW powers on.
The DOI can be turned on or off at any time by
other devices on the aircraft. Second, the functioning of the ASW can be in- 
hibited or reset at any time. This raises questions, for example, about how the 
ASW should operate if it is reset while below the threshold altitude. Finally, 
the ASW must interface with three different altimeters; furthermore, the analog 
and digital altimeters are significantly different in terms of the information that 
they provide. 

The next section introduces the RSML language by providing an overview of 
the high-level ASW requirements specification. 



4 RSML and the Nimbus Environment 

RSML was developed as a requirements specification language for embedded sys- 
tems and is based on David Harel’s Statecharts [10]. One of the main design goals
of RSML was readability and understandability by non-computer professionals 
such as users, engineers in the application domain, managers, and representatives 
from regulatory agencies [19]. 



4.1 Introduction to RSML 

An RSML specification consists of a collection of variables, states, transitions, 
functions, macros, constants, and interfaces. 

Variables in the specification allow the analyst to record the values reported 
by various external sensors (in the case of input variables) and provide a place
to capture the values of the outputs of the system prior to sending them out in 
a message (in the case of output variables). Figure 2 shows the input variable 
definition for the Altitude in the ASW requirements. 



In_Variable Altitude : Numeric
Units : ft 
Initial Value : 0 
Expected Min : 0 
Expected Max : 40,000 

Fig. 2. An Input Variable definition from the ASW requirements 



States are organized in a hierarchical fashion as in Statecharts. RSML in- 
cludes three different types of states - compound states, parallel states, and 
atomic states. Atomic states are analogous to those in traditional finite state 
machines. Parallel states are used to represent the inherently parallel or con- 
current parts of the system being modeled. Finally, compound states are used 
both to hide the detail of certain parts of the state machine so as to make the 
resulting model easier to comprehend and to encapsulate certain behaviors in 
the machine. 

The state hierarchy modeling the high level ASW requirements could be 
represented as in Figure 3. This representation includes all three types of states. 
Fully Operational is a parallel state with three direct children. All of these are 
compound states which contain only atomic states (Above, Below, etc.).



Fig. 3. High-level ASW Model

Transitions in RSML control the way in which the state machine can move 
from one state to another. A transition consists of a source state, a destination 
state, a trigger event, a guarding condition, and a set of action events that is 
produced when the transition is taken. In order to take an RSML transition, 
the following must be true: (1) the source state must be currently active, (2) 
the trigger event must occur while the source state is active, and (3) when the 
trigger event occurs, the guarding condition must evaluate to true. If all of these 
conditions are satisfied then the destination state will become active, the source
state will become inactive, and the action events will be produced. 
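
The firing rule above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Transition` class, the `step` function, and the flat set of active states are hypothetical names chosen here, not part of RSML or Nimbus, and the state hierarchy is ignored. The guard and the 2,000 ft threshold in the example are likewise illustrative.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Transition:
    source: str
    destination: str
    trigger: str
    guard: Callable[[Dict[str, float]], bool]
    actions: List[str] = field(default_factory=list)


def step(active: Set[str], event: str, variables: Dict[str, float],
         transitions: List[Transition]) -> List[str]:
    """Process one event, update `active` in place, return the action events."""
    # Decide which transitions fire against the state *before* the event.
    enabled = [t for t in transitions
               if t.source in active       # (1) the source state is active
               and t.trigger == event      # (2) the trigger event occurs
               and t.guard(variables)]     # (3) the guarding condition holds
    produced: List[str] = []
    for t in enabled:
        active.discard(t.source)           # source becomes inactive
        active.add(t.destination)          # destination becomes active
        produced.extend(t.actions)         # action events are produced
    return produced


# Hypothetical, simplified version of the Above -> Below transition.
above_to_below = Transition(
    source="Above",
    destination="Below",
    trigger="AltReceivedEvent",
    guard=lambda v: v["Altitude"] < v["AltitudeThreshold"],
    actions=["AltStatusEvaluatedEvent"],
)
```

Calling `step({"Above"}, "AltReceivedEvent", {...}, [above_to_below])` with an altitude below the threshold moves the machine to `Below` and returns the single action event; with any other event, or with the guard false, nothing changes.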

The guarding condition is simply a predicate logic expression over the various 
states and variables in the specification; however, during the TCAS project, the 
team that developed RSML (the Irvine Safety Research Group led by Dr. Nancy 
Leveson) discovered that the guarding conditions required to accurately capture 
the requirements were often complex. The propositional logic notation traditionally
used to define these conditions did not scale well to complex expressions and
quickly became unreadable. To overcome this problem, they decided to use a tabular
representation of disjunctive normal form (DNF) that they called AND/OR
tables. The far-left column of the AND/OR table lists the logical phrases. Each
of the other columns is a conjunction of those phrases and contains the logical 
values of the expressions. If one of the columns is true, then the table evaluates 
to true. A column evaluates to true if all of its elements match the truth values 
of the associated predicates. A dot denotes “don’t care.” Figure 4 (a) shows a 
transition from the high-level ASW requirements. 
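
The evaluation rule for these tables is short enough to sketch in Python. The representation below is an assumption made for illustration (a list of columns, each mapping a row's predicate name to `True`, `False`, or `None` for a dot); it is not the data structure used by the RSML tools.

```python
from typing import Dict, List, Optional

# One column of the table: predicate name -> required truth value,
# or None for a dot ("don't care").
Column = Dict[str, Optional[bool]]


def column_holds(column: Column, truth: Dict[str, bool]) -> bool:
    """A column is true if every non-dot entry matches the predicate's value."""
    return all(want is None or truth[name] == want
               for name, want in column.items())


def evaluate_and_or(columns: List[Column], truth: Dict[str, bool]) -> bool:
    """The table is true if at least one column is true (DNF)."""
    return any(column_holds(col, truth) for col in columns)


# A single-column table requiring both macros to be true.
guard_columns = [{"BelowThreshold": True, "AltitudeQualityOK": True}]
```

With a single column, the table reduces to a plain conjunction; additional columns add disjuncts without making any individual column harder to read.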

To further increase the readability of the specification, the Irvine Group in- 
troduced many other syntactic conventions in RSML. For example, they allow 
expressions used in the predicates to be defined as simple case-statement func- 
tions and familiar and frequently used conditions to be defined as macros. In 
Figure 4 (a), “BelowThreshold” and “AltitudeQualityOK” are both macros. The 
definition of the BelowThreshold macro is given in Figure 4 (b). A more refined 
(and more complex) version with several columns can be found in Figure 12. 



4.2 Inter-component Communication 

There should be a clear distinction between the inputs to a component, the outputs
from a component, and the internal state of the component. Every data
item entering and leaving a component is defined by the input and output variables.
The state machine can use both input and output variables when defining
the transitions between the states in the state machine. However, the input
variables represent direct input to the component and can only be set when receiving
the information from the environment. The output variables can be set
by the state machine and presented to the environment through output interfaces.



(a) Transition(s): Above -> Below
    Location: AltitudeStatus
    Trigger Event: AltReceivedEvent
    Condition:
        BelowThreshold()     | T |
        AltitudeQualityOK()  | T |
    Output Action: AltStatusEvaluatedEvent

(b) Macro: BelowThreshold
    Definition:
        Altitude < AltitudeThreshold | T |

Fig. 4. A Transition and Macro from the ASW requirements

RSML supports rigorous specification and analysis of system level inter- 
component communication. Communication in the framework occurs through 
simple messages consisting of a number of numeric or enumerated fields. Components
in the system are connected via channels. Each component can have any
number of incoming and/or outgoing channels. The formality of the specification 
allows us to automatically verify a specification for a number of simple safety 
and liveness constraints. For a more detailed description of the communication 
definitions and analysis procedures, the reader is referred to [13]. 

The Nimbus environment is based on the ideas that (1) the engineers would 
like to have an executable specification of the system early in the project and (2) 
that as the specification is refined it is desirable to integrate it with more detailed 
models of the environment. Therefore, in the initial stages of the project, we want 
the executions to take their input from simple models, for example, text files 
or user input. As the specification is refined, the analyst can add more detailed 
models of the sensors and actuators, for example, additional RSML specifications 
or software simulations. In order to have a closed loop simulation, a model of the 
process can be added between the sensor and actuator models. Finally, when the 
specification has been refined to the point of defining the hardware interfaces,
the analyst can execute it directly with the hardware. This hardware-in-the-loop 
simulation closes the gap between the prototype and the actual hardware. These 
ideas are illustrated in Figure 5. 



5 Specification-Based Prototyping with Nimbus 

A general view of an embedded control system can be seen in the inside square of
Figure 6. This model consists of the process, sensors, actuators, and the software
controller. The process is the physical process we are attempting to control.



Fig. 5. The Nimbus Environment 

The sensors measure physical quantities in the process. These measurements are 
provided as input to the software controller. The controller makes decisions on 
what actions are needed and commands the actuators to manipulate the process. 
The goal of the software control is to maintain some properties in the physical 
process. Thus, understanding how the sensors, actuators, and process behave is 
essential for the development and evaluation of correct software. The importance 
of this systems view has been repeatedly pointed out in the literature [25,19,14]. 

To reason about this type of software-controlled system, David Parnas and
Jan Madey defined what they call the four-variable model (outside square of 
Figure 6) [25]. In this model, the monitored variables (MON) are physical quan- 
tities we measure in the system and controlled variables (CON) are quantities 
we will control. The requirements on the control system are expressed as a map- 
ping (REQ) from monitored to controlled variables. For instance, a requirement 
may be that “when the aircraft drops below 2,000 ft, a device of interest shall be 
turned on.” Naturally, to implement the control software we must have sensors 
providing the software with measured values of the monitored variables (IN- 
PUT). The sensors transform MON to INPUT through the IN relation; thus, 
the IN relation defines the sensor functions. To adjust the controlled variables, 
the software generates output that activates various actuators that can manipu- 
late the physical process; the actuator function OUT maps OUTPUT to CON. 
The behavior of the software controller is defined by the SOFT relation that 
maps INPUT to OUTPUT. 
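
The four relations can be illustrated with a small Python sketch. It makes several assumptions that are not in the model itself: each relation is treated as a deterministic function, the sensor is a hypothetical one that reports whole feet, the actuator is driven by a hypothetical command code (1 = on, 0 = off), and the threshold is the 2,000 ft figure from the example requirement.

```python
THRESHOLD_FT = 2000.0  # illustrative threshold from the example requirement


def req(altitude_ft: float) -> bool:
    """REQ: MON -> CON. The device of interest shall be on below the threshold."""
    return altitude_ft < THRESHOLD_FT


def in_relation(altitude_ft: float) -> int:
    """IN: MON -> INPUT. Hypothetical sensor reporting the altitude in whole feet."""
    return int(altitude_ft)


def soft(input_ft: int) -> int:
    """SOFT: INPUT -> OUTPUT. Hypothetical command code: 1 = on, 0 = off."""
    return 1 if input_ft < THRESHOLD_FT else 0


def out_relation(code: int) -> bool:
    """OUT: OUTPUT -> CON. The actuator turns the device on when it sees code 1."""
    return code == 1


def controlled_via_software(altitude_ft: float) -> bool:
    """Path through sensor, software, and actuator: OUT(SOFT(IN(mon)))."""
    return out_relation(soft(in_relation(altitude_ft)))
```

For this idealized sensor and actuator, `controlled_via_software` agrees with `req` on non-negative altitudes, which is the property the software specification is meant to preserve.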

The requirements on the control system are expressed with the REQ rela- 
tion; after all, we are interested in maintaining some relationship between the 
quantities in the physical world. To develop the control software, however, we 
are interested in the SOFT relation. Thus, we must somehow refine the sys- 
tem requirements (the REQ mapping) into the software specification (the SOFT 
mapping). The Nimbus environment supports this refinement by allowing a pro- 
gressively more detailed execution of the formal model throughout all stages of 
this refinement process.



            REQ
  MON ------------> CON
   |                 ^
   | IN              | OUT
   v                 |
 INPUT ----------> OUTPUT
           SOFT

Fig. 6. The four variable model for process control systems



5.1 Structuring SOFT 

The IN and OUT relations are determined by the sensors and actuators used in 
the system. For example, to measure the altitude we may use a radio altimeter 
providing the measured altitude as an integer value. Similarly, to turn on a device 
a certain code may have to be transmitted over a serial line. Armed with the 
REQ, IN, and OUT relations we can derive the SOFT relation. The question is,
how shall we do this and how shall we structure the SOFT relation in a language 
such as RSML? 



       IN               SOFT               OUT
MON ------> INPUT --------------> OUTPUT ------> CON

INPUT --IN^-1--> (estimate of MON) --SOFT_REQ--> (internal CON) --OUT^-1--> OUTPUT

Fig. 7. The SOFT relation can be split into three composed relations. The SOFT_REQ
relation is based on the original requirements (REQ) relation.

The system requirements should always be expressed in terms of the physical
process. These requirements will most likely change over the lifetime of the controller
(or family of similar controllers). The sensors and actuators are likely to
change independently of the requirements as new hardware becomes available or
the software is used in subtly different operating environments; thus, the REQ,
IN, and OUT relations are all likely to change. If any one of the REQ, IN, or OUT
relations changes, the SOFT relation must be modified. To provide a smooth transition
from system requirements (REQ) to software requirements (SOFT) and
to minimize the impact of requirements, sensor, and actuator changes,
Steven Miller at Rockwell Collins has proposed to structure the software
specification SOFT based heavily on the structure of the REQ relation [24].

J.M. Thompson, M.P.E. Heimdahl, and S.P. Miller

Miller proposed to achieve this by splitting the SOFT relation into three
pieces, IN^-1, OUT^-1, and SOFT_REQ (Figure 7). IN^-1 takes the measured input
and reconstructs an estimate of the physical quantities in MON. The OUT^-1
relation maps the internal representation of the controlled variables to the output
needed for the actuators to manipulate the actual controlled variables. Given
the IN^-1 and OUT^-1 relations, the SOFT_REQ relation will now be essentially
isomorphic to the REQ relation and, thus, be robust in the face of likely changes
to the IN and OUT relations (sensor and actuator changes). Such changes would
only affect the IN^-1 and OUT^-1 portions of the software specification.
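
A minimal Python sketch of this structuring follows. The two sensors (one reporting feet, one reporting metres), the command codes, and all function names are hypothetical and chosen only to show that swapping a sensor touches the IN^-1 piece while the SOFT_REQ piece stays unchanged.

```python
from typing import Callable

THRESHOLD_FT = 2000.0  # illustrative threshold


def soft_req(estimated_altitude_ft: float) -> bool:
    """SOFT_REQ: mirrors REQ, but works on the *estimated* monitored variable."""
    return estimated_altitude_ft < THRESHOLD_FT


def out_inverse(device_on: bool) -> int:
    """OUT^-1: internal controlled value -> hypothetical actuator code."""
    return 1 if device_on else 0


def in_inverse_feet(raw: float) -> float:
    """IN^-1 for a hypothetical sensor that already reports feet."""
    return raw


def in_inverse_metres(raw: float) -> float:
    """IN^-1 for a hypothetical replacement sensor that reports metres."""
    return raw / 0.3048


def make_soft(in_inverse: Callable[[float], float],
              out_inv: Callable[[bool], int]) -> Callable[[float], int]:
    """SOFT as the composition OUT^-1 after SOFT_REQ after IN^-1."""
    return lambda raw: out_inv(soft_req(in_inverse(raw)))


soft_with_feet_sensor = make_soft(in_inverse_feet, out_inverse)
soft_with_metre_sensor = make_soft(in_inverse_metres, out_inverse)
```

Both composed functions share the same `soft_req`; only the argument passed for `in_inverse` differs between them.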

In the rest of this section we will illustrate how this framework for require- 
ments specification and requirements refinement is used. We will also demon- 
strate how the Nimbus environment supports dynamic evaluation of the various 
models throughout the refinement process. The result of this refinement pro- 
cess will be a formal specification of SOFT that, in Nimbus, serves as a rapid 
prototype. 

5.2 Explore the Requirements (REQ) 

The first step in a requirements modeling project is to define the system bound- 
aries and identify the monitored and controlled variables in the environment. In 
this paper we will not go into the details of how to scope the system requirements 
and identify the monitored and controlled variables. Guidelines to help identify 
monitored and controlled variables are covered in, for example, [24,7,17].

In the case of the altitude switch, we identified the aircraft altitude as one 
monitored variable and the commands that the ASW sends to the device of in- 
terest as a controlled variable. Both are clearly concepts in the physical world, 
and thus suitable candidates as monitored and controlled variables for the re- 
quirements model. The definition of the Altitude variable can be seen in Figure 2 
and the graphical view of the requirements model is shown in Figure 3. In this 
paper, we will not discuss the details of the requirements model itself; instead, 
we focus on the refinement and execution of such models using Nimbus. 

The Nimbus environment allows us to execute and simulate this model using 
input data representing the monitored variables and collect output representing 
the controlled variables. Input data could come from several sources. The simplest
option for input is, of course, to have the user specify the values (either
interactively, or by putting the values into a text file ahead of time). This scenario
is illustrated in Figure 8(a). Unfortunately, it is often difficult to create
appropriate input values since the physical characteristics of the environment 
enforce constraints and interrelationships over the monitored variables. Thus, to 
create a valid (i.e., physically realistic) input sequence, the analyst must have a 
model of the environment. Initially, this model may be an informal mental model 
of how the environment operates. As the evaluation process progresses, however, 
a more detailed model is most likely needed. Therefore, in this stage of the mod- 
eling we may develop a simulation of the physical environment. The Nimbus 
architecture lets us easily replace the inputs read from text files with a software
simulation emulating the environment. This refinement can be done without any
modifications to the REQ model. For the ASW, we created a spreadsheet in Microsoft
Excel to emulate the behavior of the aircraft (Figure 8(b)). This simple
environmental model allows us to interactively modify the ascent and descent 
rates of the aircraft, and easily explore many possible scenarios. 



5.3 Refine REQ to SOFT_REQ

From the start of the modeling effort, we know that we will not be able to directly
access the monitored and controlled variables; we must use sensors and actuators.
Thus, when refining REQ to SOFT, we will not be able to use variables such
as Altitude. At this early stage, we may not know exactly what hardware will
be used for sensors and actuators, but we do know that we must use something
and we may as well prepare for it. By simply encapsulating the monitored and
controlled variables we can get a model that is essentially isomorphic to the
requirements model; the only difference is that this model is more suited for the
refinement steps that will follow as the surrounding system is completed.



Specification-Based Prototyping for Embedded Systems

Fig. 8. The REQ relation can be evaluated using text files or user input (a) or by
interacting with a simulation of the environment (b).

In our case, using a function, MeasuredAltitude(), instead of the monitored
variable Altitude will shield the specification from possible changes in how the
altitude measure is delivered to the software. By performing this encapsulation
for all monitored and controlled variables we refine REQ to SOFT_REQ, a mapping
from estimates of the monitored variables to an internal representation of
the controlled variables.

5.4 IN, OUT, IN^-1, and OUT^-1

As the hardware components of the system are defined (either developed in house 
or procured), the IN and OUT relations can be rigorously specified. The IN and 
OUT models represent our assumptions about how the sensors and actuators 
operate. In the altitude switch we will use one analog and two digital altimeters. 
Thus, we will map the true altitude in the physical world to three software inputs 
(Figure 9). 



              +--> DigitalAlt_1
Altitude -----+--> DigitalAlt_2
              +--> AnalogAlt

Fig. 9. The true altitude is mapped to three software inputs.



In the case of the digital altimeter, the altitude will be reported over an 
ARINC-429 low speed bus as a signed integer that represents a fraction of 
8,192 ft. If we ignore inaccuracies introduced in the altitude measure and prob- 
lems caused by the limited resolution of the ARINC-429 word, the transfer func- 
tion for the digital altitude measures can be defined as 



\[ \mathit{DigitalAlt} = \frac{\mathit{Altitude}}{8192} \]



Fig. 10. The refined model of ASW with models of the three altimeters added.



The analog altimeter operates in a completely different way. Due to consid- 
erations of cost and simplicity of construction, the analog altimeter does not 
provide an actual altitude value, only a Boolean indication if the measured alti- 
tude is above or below a hardwired threshold (defined to be the same as the one 
required in the altitude switch). Assuming again an ideal measure of the true 
altitude, the transfer function for the analog altimeter could be modeled as 



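\[ \mathit{AnalogAlt} =
   \begin{cases}
     \mathit{Above} & \text{if } \mathit{Altitude} > \mathit{Threshold} \\
     \mathit{Below} & \text{if } \mathit{Altitude} < \mathit{Threshold}
   \end{cases} \]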



In addition, all three altimeters provide an indication regarding the quality of 
the altitude measures. 



Function MeasuredDigitalAlt(Numeric input) : Numeric 
Equals input * DigitalAltMultiplier If True ; 

Comment : DigitalAltMultiplier is defined as a constant 8192 

Fig. 11. The function transforming the input from the digital altimeter to an estimate 
of the true altitude. 
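
The two idealized transfer functions and the inverse of Figure 11 can be written out in Python as a quick check that the pieces fit together. This is only a sketch: the function names are chosen here for illustration, values are kept as floats (so the limited resolution of the ARINC-429 word is ignored, as in the text), and the treatment of an altitude exactly equal to the threshold as Above is an assumption, since the definition above leaves that case open.

```python
DIGITAL_ALT_MULTIPLIER = 8192  # constant from Figure 11


def digital_alt(altitude_ft: float) -> float:
    """IN for a digital altimeter: altitude as a fraction of 8,192 ft."""
    return altitude_ft / DIGITAL_ALT_MULTIPLIER


def measured_digital_alt(raw_input: float) -> float:
    """IN^-1 for a digital altimeter (Figure 11): estimate of the true altitude."""
    return raw_input * DIGITAL_ALT_MULTIPLIER


def analog_alt(altitude_ft: float, threshold_ft: float) -> str:
    """IN for the analog altimeter: only an Above/Below indication.

    Assumption: an altitude exactly at the threshold is reported as "Above".
    """
    return "Below" if altitude_ft < threshold_ft else "Above"
```

Under these idealized assumptions `measured_digital_alt(digital_alt(a))` returns `a` again, which is what lets the refined specification keep reasoning in feet.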



With the information about the sensor (IN) and actuator (OUT) relations,
we can start refining the SOFT_REQ relation towards SOFT. In our case we must
model, among other things, the three sources of altitude information and fuse
them into one estimate of whether we are above or below the threshold altitude. To
achieve this, we refine the IN^-1 relation in our model. The refined state machine
can be seen in Figure 10. Internal models of the perceived state of the sensors
have been included in the state machine. These new state machines are used to
model IN^-1. Instead of the idealistic true altitude used when evaluating REQ,
the specification now takes two digital altitude measures and one analog estimate
of the altitude as input. The function estimating the true altitude from the
digital altitude input is shown in Figure 11 and the modified BelowThreshold()
macro is shown in Figure 12. Thanks to the structuring of the SOFT relation,
this refinement could be done with minimal changes to the SOFT_REQ relation
(compare the structure of the state machines in Figure 3 and Figure 10). As the
components in the environment are developed, this process will be repeated for 
all inputs and outputs until a detailed definition of the SOFT relation is derived.



Macro: BelowThreshold

Definition:
                                                               OR
    MeasuredDigitalAlt(DigitalAlt_1) < AltitudeThreshold   | T | . | . |
    DigitalAltimeter_1_OK()                                | T | . | . |
    MeasuredDigitalAlt(DigitalAlt_2) < AltitudeThreshold   | . | T | . |
    DigitalAltimeter_2_OK()                                | . | T | . |
    AnalogAltitudeMeasure() = Below                        | . | . | T |
    AnalogAltimeter_OK()                                   | . | . | T |

Fig. 12. Macro modified to handle the three inputs instead of the true altitude as it did
in the REQ model.
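
Reading Figure 12 as a three-column AND/OR table with one column per altimeter, the refined macro can be sketched in Python as follows. The parameter names are hypothetical stand-ins for the RSML inputs and macros, and the column-per-altimeter reading is an interpretation of the figure rather than something stated in the text.

```python
DIGITAL_ALT_MULTIPLIER = 8192  # constant from Figure 11


def measured_digital_alt(raw_input: float) -> float:
    """Estimate of the true altitude from a digital altimeter input."""
    return raw_input * DIGITAL_ALT_MULTIPLIER


def below_threshold(digital_alt_1: float, digital_1_ok: bool,
                    digital_alt_2: float, digital_2_ok: bool,
                    analog_measure: str, analog_ok: bool,
                    altitude_threshold: float) -> bool:
    """True if any altimeter that reports good quality indicates 'below'."""
    column_1 = digital_1_ok and measured_digital_alt(digital_alt_1) < altitude_threshold
    column_2 = digital_2_ok and measured_digital_alt(digital_alt_2) < altitude_threshold
    column_3 = analog_ok and analog_measure == "Below"
    return column_1 or column_2 or column_3  # OR over the table's columns
```

A reading from an altimeter whose quality indication is bad is ignored, so a single healthy altimeter is enough to establish that the aircraft is below the threshold.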



5.5 Nimbus and Models of the Environment 

When evaluating RSML specifications in Nimbus, the analyst has great freedom 
in how he or she models the environment. When we evaluated the REQ model in 
Section 5.2, we used text files or a software simulation of the physical process to 
provide the RSML model with monitored variables and to evaluate the controlled 
variables. As the IN^-1 and OUT^-1 relations are added to the RSML model, the
data provided (and consumed) by the model of the embedding environment must
be refined to reflect the software inputs and outputs (INPUT and OUTPUT)
instead of the monitored and controlled variables. This can be achieved in two
ways: (1) refine the model of the physical process to produce INPUT and consume
OUTPUT, or (2) add explicit models of the sensors and actuators to the
simulation. In reality, the refinement of the environmental model and the SOFT
relation progress in parallel as an iterative process. The sensor and actuator
models may be added one at a time and the interaction with different components
may merit different refinement strategies. Nimbus naturally allows any
combination of the approaches mentioned above to be used.

In the case of the Altitude Switch, to simulate the refined SOFT relation 
(Figure 10) we modified our Excel model of the physical environment to pro- 
duce digital and analog altitude measures (Figure 13(a)). The refinement was 
achieved by simply making Excel provide the three altitudes and applying the 
two transfer functions defined in Section 5.4 before the output was sent to the 
RSML model. Adding measurement errors to the sensor models can further re- 
fine the simulation of the ASW. For instance, by modifying the computation of 
the digital altimeter outputs to 



\[ \mathit{DigitalAlt} = \frac{\mathit{Altitude}}{8192} + e \]

where e is some normally distributed random error (easily modeled using standard
functions in Excel), we can provide a more realistic simulation that includes
the natural noise in the data from the altimeters.
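
The same noisy sensor model can be sketched outside a spreadsheet. The Python version below is an assumption-laden illustration, not the Excel model used in the paper: the function name and the `sigma` parameter are hypothetical, the error has zero mean, and `sigma` is expressed in the same units as the reported fraction of 8,192 ft.

```python
import random


def noisy_digital_alt(altitude_ft: float, sigma: float, rng: random.Random) -> float:
    """Digital altimeter output with additive, normally distributed error e."""
    e = rng.gauss(0.0, sigma)  # zero-mean error with standard deviation sigma
    return altitude_ft / 8192 + e


# Passing an explicitly seeded generator makes a simulation run repeatable.
rng = random.Random(42)
samples = [noisy_digital_alt(1500.0, 0.001, rng) for _ in range(5)]
```

Seeding the generator is a design choice that keeps a scenario reproducible while a specification is being debugged; an unseeded `random.Random()` would give a fresh noise sequence on every run.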

As an alternative to refining the Excel model to include the altimeter models, 
we can explicitly add altimeter models to the simulation (Figure 13(b)). In our 
case, we added altimeter models expressed in RSML. By adding explicit models
of the sensors and actuators, we can easily explore how the software controller
reacts to simulated sensor and actuator failures. Note that the integration of various
different models with the RSML simulation of the control software does not
require any modifications of the RSML model; the channel architecture of Nimbus
allows the analyst to easily interchange the component models comprising
the environment.

Fig. 13. Refined models of the environment: (a) using Excel to simulate the physical
process as well as the sensors and (b) using Excel to simulate the physical process and
RSML models to model the sensors.

As the refinement of the SOFT relation and the models of the environment 
progresses, we may at some point desire to perform hardware-in-the-loop simu- 
lation. Such simulations are easily accommodated in the Nimbus framework. If 
we want to evaluate the ASW software requirements interacting with the actual 
hardware components, we can use any standard data acquisition card to access 
the hardware^. Nimbus provides a collection of sample interfaces to the data 
acquisition card that can be easily modified to communicate with the desired 
hardware components. In the case of the ASW, we may want to take actual input 
from two digital altimeters and use a software simulation for the device of interest 
(Figure 14)^. 




Fig. 14. An example of how hardware-in-the-loop simulation is achieved in the Nimbus 
framework. 



^ Currently, we are using the National Instruments DAQ 1200 series modules. 

^ Note that we have not yet had the opportunity to use actual digital altimeters in 
our simulations. 
The use of hardware-in-the-loop simulation not only provides a powerful 
evaluation of the proposed software system; we can also use Nimbus to evaluate 
the physical system itself. For instance, by forcing the RSML model of the 
software requirements into unexpected and/or hazardous states, we can inject 
simulated software failures into the hardware system. 

To summarize, Nimbus provides a flexible framework in which a software 
requirements model expressed in RSML can be executed while it interacts with 
various models of the other components in a proposed system. Nimbus supports 
the refinement of the REQ relation to a SOFT relation and allows easy inter- 
change of components in the environment as the refinement takes place. It is 
important to recognize the difference between models which are good for repre- 
senting the process versus models, like RSML models, which are good for mod- 
eling the software control of the process. Modeling the process itself accurately 
may require complex numerical functions, for example, generating normally dis- 
tributed random errors. These types of functions are not and should not be 
within the scope of RSML. However, an accurate model of the process is key 
to the success of specification-based prototyping. This is the reason that Nimbus 
provides the flexibility to integrate with many different models expressed in 
various ways, and this flexibility is one of the primary contributions of the Nimbus environment. 



6 Conclusion 

Specification-based prototyping is an approach to requirements specification and 
evaluation that integrates the advantages associated with a readable and formal 
requirements specification with the power of rapid prototyping, while at the 
same time eliminating many of the current drawbacks of rapid prototyping. 

To support our approach, we have developed the Nimbus environment in 
which the requirements specification can be executed. In this flexible frame- 
work, software requirements models expressed in RSML can interact with either 
(1) user input or text files, (2) high-level RSML models of the components in the 
environment, (3) software simulations of the components (at varying levels of 
refinement), or (4) the actual physical components in the target system 
(hardware-in-the-loop simulation). Since we support the execution of requirements models 
at various levels of refinement, we can evaluate the behavior of models ranging 
from high-level systems requirements to detailed requirements of the software. 
The flexibility of the Nimbus environment allows us to seamlessly refine a sys- 
tem’s requirements model to a software requirements model while continuously 
having the ability to execute and simulate the models in a realistic embedding 
environment. Furthermore, this flexibility allows the executable specification to 
serve as the prototype of the proposed system at each level of refinement. At the 
end of the refinement process we have a fully formal, analyzable, and executable 
model to use as a basis for production software development. 

Our approach to requirements execution and system simulation has many 
advantages over previous approaches suggested for embedded systems because 
of the combination of three factors. First, RSML is a readable and easy to under- 
stand requirements modeling language. This simplicity allows the customers to 
be intimately involved in the specification and development of the requirements 
model, whereas currently they are often only involved in the evaluation of the 
executions and simulations based on a rapidly coded prototype developed in a 
standard programming language. Second, the capability to simulate the system 
as a whole enables early dynamic evaluation of system level properties such as 
safety, robustness, and fault tolerance. Third, the executable requirements 
specification is used as a high-level prototype of the proposed software. The dynamic 
behavior of the system can be evaluated through execution and simulation. Once 
this behavior is deemed satisfactory, the resulting formal requirements specifi- 
cation is guaranteed to be consistent with the behavior of the prototype, and 
the requirements can be used as a basis for development of the production sys- 
tem. The guaranteed consistency between the prototype and the requirements 
specification eliminates the problems of inconsistent documentation commonly 
associated with prototyping [6]. 

We are currently investigating specification-based prototyping further. We 
are gathering experience from the use of Nimbus and we are developing guide- 
lines and a process for how to effectively take advantage of the opportunities 
presented with this type of environment. 

Abstract. The use of type casts is pervasive in C. Although casts 
provide great flexibility in writing programs, their use obscures the meaning 
of programs, and can present obstacles during maintenance. Casts in- 
volving pointers to structures (C structs) are particularly problematic, 
because by using them, a programmer can interpret any memory region 
to be of any desired type, thereby compromising C’s already weak type 
system. 

This paper presents an approach for making sense of such casts, in terms 
of understanding their purpose and identifying fragile code. We base our 
approach on the observation that casts are often used to simulate object- 
oriented language features not supported directly in C. We first describe 
a variety of ways - idioms - in which this is done in C programs. We 
then develop a notion of physical subtyping, which provides a model that 
explains these idioms. 

We have created tools that automatically analyze casts appearing in C 
programs. Experimental evidence collected by using these tools on a large 
amount of C code (over a million lines) shows that, of the casts involving 
struct types, most (over 90%) can be associated meaningfully - and 
automatically - with physical subtyping. Our results indicate that the 
idea of physical subtyping is useful in coping with casts and can lead to 
valuable software productivity tools. 



1 Introduction 

In the C programming language, a programmer can use a cast to coerce the 
type of a given expression into another type. Casts offer great flexibility to a 
programmer. In particular, because C allows a pointer of a given type to be cast 
into any other pointer type, a programmer can reinterpret the value at a memory 
location to be of any desired type. As a consequence, C programmers can - and 
often do - exploit the physical layout of structures (structs) in memory in 
various ways. Moreover, casts come with little or no performance cost, as most 
casts do not require extra machine code to be generated. The use of casts is 
pervasive in C programs. 
typedef struct { 
    int x, y; 
} Point; 

typedef enum { 
    RED, BLUE 
} color; 

typedef struct { 
    int x, y; 
    color c; 
} ColorPoint; 

void translateX(Point *p, int dx) { 
    p->x += dx; 
} 

main() { 
    Point pt; 
    ColorPoint cpt; 

    translateX(&pt, 1); 
    translateX((Point *) &cpt, 1); 
} 

Fig. 1. A simple example of subtypes in C: ColorPoint can be thought of as a subtype 
of Point 



A major problem with casts is that they make programs difficult to understand. 
Casts diminish the usefulness of type declarations in providing clues about 
the code. For example, a pointer variable can be made to point to memory of a 
type unrelated to the variable’s declared type. Another major problem is that 
casts make programs fragile to modify. Casts induce relationships between types 
that, at first glance, may appear to be unrelated to one another. As a result, it 
may not be safe for a programmer to add new fields to a struct S, because the 
code may rely on the memory layout of other structs that share a relationship 
with S through pointer casts. 

The preceding problems are exacerbated by the fact that there are currently 
no tools that assist a programmer in analyzing casts in C programs. C compilers 
do not check that the reinterpretation of memory via casts is done in a meaningful 
way. As stated before, C allows casts between any pair of pointer types. For the 
same reason, tools such as lint do not provide any help on seemingly inexplicable, 
yet legal casts. 

This paper presents a semi-automatic approach to making sense of casts that 
involve pointers to struct types. We base our approach on the observation that 
casts involving pointers to struct types can often be considered as simulating 
subtyping, a language feature not found in C. This observation is supported by 
an analysis of over 1.3 million lines of C code containing over 7000 occurrences 
of such casts. Our analysis examines each cast appearing in a program, and 
computes the relationship between the pair of C types involved in the cast. This 
relationship is usually an upcast or a downcast, but sometimes neither of the 
two. In the less frequent last case (we found 1053 total occurrences involving 
127 unique pairs), the user must inspect the participating types manually. We 
have identified several patterns of usage occurring in C code, and have found 
that the seemingly unrelated types in the last case usually fall into one of these 
patterns. 

Consider the C code shown in Fig. 1. The function translateX is defined to 
take two arguments: p (a pointer to a Point) and dx (an integer). The function 
translates the horizontal component of the object p points to by dx units. Since 
pt is declared to be a Point, the expression translateX(&pt , 1) is legal in C. 
We may also wish to apply translateX to a variable cpt of type ColorPoint, 
but the statement translateX (&cpt , 1) is not (strictly speaking) legal in C. 
However, we can cast a pointer to a ColorPoint to be a pointer to a Point as 
shown in Fig. 1. This works because of the way values of the types Point and 
ColorPoint are laid out in memory;^ in effect, the cast of actual parameter &cpt 
in this call on translateX causes one type (i.e., ColorPoint) to be treated as a 
subtype of another type (i.e., Point). 
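
The layout argument can be checked mechanically. The following is a minimal sketch that reuses the declarations of Fig. 1 and compares member offsets with the standard offsetof macro. The helper name prefix_matches is ours, introduced only for illustration; it is not part of the tools described in this paper.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y; } Point;
typedef enum { RED, BLUE } color;
typedef struct { int x, y; color c; } ColorPoint;

void translateX(Point *p, int dx) {
    p->x += dx;
}

/* Returns 1 when x and y sit at the same offsets in both structs,
   which is the property the cast in Fig. 1 relies on.
   (Hypothetical helper, named here for illustration only.) */
int prefix_matches(void) {
    return offsetof(Point, x) == offsetof(ColorPoint, x)
        && offsetof(Point, y) == offsetof(ColorPoint, y);
}

/* Applies translateX to a ColorPoint through the cast discussed in the text. */
void translate_color_point(ColorPoint *cp, int dx) {
    assert(prefix_matches());
    translateX((Point *) cp, dx);
}
```

A check of this kind only confirms that the two declarations currently agree; it says nothing about whether they will stay in agreement as the code evolves.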

The cast from ColorPoint to Point is an example of treating an instance 
of one type as an instance of another type - an “is-a” relationship. In many 
programming languages (for example, C++), this relationship can be captured 
explicitly with subtyping. However, C has no such mechanism, so users who wish 
to capture the “is-a” relationship rely on two things: type casts and the layout 
of data in memory. 

In this example, our analysis explains the cast by reporting that ColorPoint 
and Point are involved in a subtype relationship. It also points out that the types 
ColorPoint and Point may not be modified independently of each other. For 
example, one cannot add a new field at the beginning of Point, and continue to 
use the function translateX on ColorPoint, unless the same field is also added 
to ColorPoint. 
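
To make the fragility concrete, the sketch below applies exactly the kind of edit described above: a new field (the hypothetical id) is added at the front of Point while ColorPoint is left alone. The offsets of x then disagree, so a cast from ColorPoint* to Point* would make translateX update the wrong bytes. The names id and x_offsets_agree are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { RED, BLUE } color;

/* Hypothetical edit: a new field is added at the front of Point only. */
typedef struct { int id; int x, y; } Point;

/* ColorPoint is left unchanged, so it no longer shares a prefix with Point. */
typedef struct { int x, y; color c; } ColorPoint;

/* Returns 1 when x is at the same offset in both structs. */
int x_offsets_agree(void) {
    return offsetof(Point, x) == offsetof(ColorPoint, x);
}

/* A guard that code relying on the cast could run before using it. */
void require_compatible_layout(void) {
    assert(x_offsets_agree());
}
```

In this modified program require_compatible_layout would fail, which is the situation the analysis is meant to flag before the code is changed.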

The contributions of this paper are as follows: 

— We identify how type casts and the layout of data structures are used to 
simulate various object-oriented features in C. In particular, we present sev- 
eral commonly used idioms in C programs that represent C++-style object- 
oriented constructs and discuss the role of type casts in these idioms. 

— We define the notion of physical subtyping and present rules by which the 
physical-subtype relationship may be inferred. Physical subtypes are 
important because they provide a model that captures most of the object-oriented 
casting idioms found in C. 

— We describe a pair of software tools based on physical subtyping. The cast- 
analyzer tool classifies all the type casts in a program using the physical- 
subtype relationship. The struct-analyzer tool captures the physical subtyping 
relationships between all pairs of types in a C program. As we shall 
discuss later, a programmer can use these tools in combination, both for 
understanding the purpose of casts appearing in the program, and for dis- 
covering related types in the program that must be modified consistently. 

We have run these tools on a number of large C programs taken from a variety 
of sources, including C programs from the SPEC95 benchmark suites, various 
GNU programs, and telephone call processing code from Lucent Technologies. 
Our tools and experimental results point the way to several software engineering 
applications of these tools: 

^ The ANSI C standard makes certain guarantees about the layout of the fields of 
structs in memory. In particular, the first field of all structs is always allocated at 
offset zero, and compatible common prefixes of two structs are laid out identically. 
— To help programmers quickly learn about the relationships between data types 
in their programs. The physical-subtype relationship can be naturally shown 
as a directed graph, which can be presented to a programmer to provide a 
visualization of the relationship between data types in their programs. (See 
Section 4.3). 

— To identify fragile code. In our experience, code containing casts that violate 
the physical-subtype relationship is very fragile, because a programmer may 
introduce erroneous data references by using inconsistent type declarations. 
We present a detailed study of one such fragile cast identified in the telephone 
code (Section 5). 

— To aid in the conversion of C to object-oriented languages such as Java and 
C++. The identification of physical subtypes in C programs provides a seed 
to the process of converting C programs to C++ or Java. 

Section 2 explains several common casting idioms by which C programmers 
emulate object-oriented programming. Section 3 presents a type system for C 
and formalizes physical subtypes by presenting a collection of inference rules. 
Section 4 describes our implementation of two complementary tools that identify 
physical subtypes, and the results of applying these tools to our benchmarks. 
Section 5 discusses related work. 



2 Object-Oriented Idioms in C 

In this section, we consider several object-oriented idioms that can be found 
with perhaps surprising frequency in C programs. These idioms emulate C++ 
features, such as inheritance, class hierarchies, and virtual functions. 



2.1 Inheritance 

Redundant declarations C programmers can emulate public inheritance in 
a variety of ways. Perhaps the most common, at least for data types with a 
small number of members, is by declaring one struct type’s member list to 
have another struct type’s member list as a prefix. This is illustrated by the 
Point and ColorPoint structs appearing in Fig. 1. Instances of ColorPoint 
can be used in any context that allows the use of an instance of Point. Any valid 
context expecting a Point can, at most, refer to the Point’s x and y members. 
Any instance of ColorPoint has such x and y members at the same relative 
offsets as every instance of Point. 



First members The use of redundant declarations is perhaps the simplest 
method of implementing subtyping in C. However, making a textual copy of the 
members of a base class in the body of each derived class is both cumbersome and 
error-prone. The first-member idiom represents an improvement that alleviates 
both of these problems. 







M. Siff et al. 



Subtype relationships often characterize is-a relationships, as in “a color point 
is-a point” . Members of struct types often characterize has-a relationships. For 
example, a Person has-a name: 

typedef struct { ... char *name; ... } Person; 

However, because C guarantees that the first member of an object of a struct 
type begins at the same address at which the object itself begins, the first mem- 
ber can also reflect an is-a relationship. For example, consider this alternative 
definition of ColorPoint: 

typedef struct { 

Point p; 
color c; 

} ColorPoint; 

Now a ColorPoint can be used where a Point is expected in two equivalent 
ways: 

ColorPoint cp; 

void translateX(Point *, int); 

translateX((Point *) &cp, 1); 
translateX(&(cp.p), 1); 

In the second call to translateX, the reference to the Point component of cp is 
made more explicit (at the cost of having the programmer remember the names 
of the first member and modifying such code if and when the member names 
change). 
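
The equivalence of the two calls can be sketched as follows. The example assumes the first-member definition of ColorPoint given above; the helper same_address is ours and exists only to make the address equality observable.

```c
#include <assert.h>

typedef struct { int x, y; } Point;
typedef enum { RED, BLUE } color;
typedef struct { Point p; color c; } ColorPoint;   /* first-member idiom */

void translateX(Point *p, int dx) {
    p->x += dx;
}

/* Returns 1 when the cast and the explicit member access name the same
   address. (Hypothetical helper, for illustration only.) */
int same_address(ColorPoint *cp) {
    return (Point *) cp == &cp->p;
}

/* Translates a ColorPoint twice, once through each form of the call. */
void translate_both_ways(ColorPoint *cp, int dx) {
    assert(same_address(cp));
    translateX((Point *) cp, dx);   /* via the cast */
    translateX(&(cp->p), dx);       /* via the named first member */
}
```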



Array padding The first-member idiom can also be implemented in a slightly 
different manner, in which the allocation of storage space for the members of 
the base class is separated from the access to those members. Consider another 
definition for ColorPoint: 

typedef struct { 
    char base[sizeof(Point)];   /* storage space for a Point */ 
    color c; 
} ColorPoint; 

In this definition of ColorPoint, sufficient space is allocated to hold an en- 
tire Point rather than explicitly declaring a member of type Point (as in the 
first-member idiom) or using all the members of Point (as in the redundant- 
declaration idiom). The space is allocated by using a byte (char) array of the 
same size as Point. 
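
A minimal sketch of how such reserved storage might be used is shown below. Code in the wild typically reaches the base members through a pointer cast; here we copy with memcpy instead, which is our choice for the illustration because it sidesteps alignment questions about the char array. The helpers set_point and get_point are hypothetical names.

```c
#include <assert.h>
#include <string.h>

typedef struct { int x, y; } Point;
typedef enum { RED, BLUE } color;

typedef struct {
    char base[sizeof(Point)];   /* storage space for a Point */
    color c;
} ColorPoint;

/* Store a Point in the reserved bytes (hypothetical helper). */
void set_point(ColorPoint *cp, Point p) {
    assert(sizeof cp->base == sizeof p);
    memcpy(cp->base, &p, sizeof p);
}

/* Read the Point back out of the reserved bytes (hypothetical helper). */
Point get_point(const ColorPoint *cp) {
    Point p;
    memcpy(&p, cp->base, sizeof p);
    return p;
}
```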

This idiom is prevalent in several large systems that we have analyzed with 
our tools (described in Sect. 4), most notably telephone and gcc. 

Due to space limitations, another interesting inheritance idiom - flattening 
- is not discussed here. The reader is directed to [10] for more details. 
typedef struct { 
    int x, y; 
} Point; 

typedef struct { 
    color c; 
} AuxColor; 

typedef struct { 
    Point p; 
    AuxColor aux; 
} ColorPoint; 

typedef struct { 
    char name[10]; 
} AuxName; 

typedef struct { 
    Point p; 
    AuxColor aux; 
    AuxName aux2; 
} NamedColorPoint; 



Fig. 2. A class hierarchy in C 



2.2 Class Hierarchies 

It is not uncommon to find implicit class hierarchies in C programs using one 
or more of the inheritance idioms discussed above. One interesting combination 
is to use the first-member idiom for the top level of inheritance and then to use 
the redundant-declaration idiom for deeper levels of inheritance. An example 
appears in Fig. 2. Observe that NamedColorPoint can be thought of as a subclass 
of Point by the first-member idiom and as a subclass of ColorPoint by the 
redundant-declaration idiom. Using the tools described in Sect. 4, we found that 
this idiom is prevalent in xemacs (a graphical-user-interface version of the text 
editor emacs). 

2.3 Downcasts 

It is a common object-oriented practice to allow objects of a derived class to be 
treated as if they are objects of a base class. This notion is referred to as an 
upcast - as in casting up from a subclass (subtype) to a superclass (supertype). 
The complementary notion of a downcast is not as common, but still very useful 
in object-oriented programming. A downcast causes an object of a base class 
to be treated as an object of a derived class, or in C, casts an expression of 
supertype down to a subtype. The following is a simple example of a downcast: 

void make_red(ColorPoint *cp) { 
    cp->c = RED; 
} 

ColorPoint cp0; 
Point *pp; 

pp = &cp0;                      /* upcast from ColorPoint to Point */ 
make_red((ColorPoint *) pp);    /* downcast from Point to ColorPoint */ 

As this example illustrates, downcasts can be sensible in cases where type 
information has been lost through a previous upcast. The problem of identifying 
cases where downcasts are used without a preceding upcast is an aim of our 
future research. 
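
The round trip can be written out as a small self-contained sketch. We assume the first-member definition of ColorPoint from Sect. 2.1, and the function paint_via_point is a hypothetical stand-in for code that only ever sees a Point*.

```c
#include <assert.h>

typedef struct { int x, y; } Point;
typedef enum { RED, BLUE } color;
typedef struct { Point p; color c; } ColorPoint;   /* first-member idiom */

void make_red(ColorPoint *cp) {
    cp->c = RED;
}

/* Receives only a Point*, but the caller promises that it was obtained by
   upcasting a ColorPoint. Nothing in the type system checks that promise. */
void paint_via_point(Point *pp) {
    assert(pp != 0);
    make_red((ColorPoint *) pp);   /* downcast from Point to ColorPoint */
}
```

If paint_via_point were handed the address of a plain Point, the store to c would write past the end of that object, which is why a downcast with no matching upcast deserves attention.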
typedef enum { 
    CIRCLE, RECTANGLE 
} shape_kind; 

typedef struct { 
    shape_kind kind; 
} Shape; 

typedef struct { 
    Shape s; 
    double radius; 
} Circle; 

typedef struct { 
    Shape s; 
    double length, width; 
} Rectangle; 

double circ_area(Circle *c); 
double rect_area(Rectangle *r); 

double area(Shape *s) { 
    switch (s->kind) { 
    case CIRCLE: 
        return (circ_area((Circle *)s)); 
    case RECTANGLE: 
        return (rect_area((Rectangle *)s)); 
    } 
} 



Fig. 3. An example illustrating the use of explicit run-time type information to simulate 
virtual functions 



2.4 Virtual Functions 

Downcasts are necessary in order to implement virtual functions, which are one 
of the most powerful aspects of object-oriented programming. 

There are several ways in which virtual functions can be simulated in C. The 
most common is probably via the addition of run-time type information (RTTI) 
to data types in conjunction with switch statements that choose how a function 
should behave based on the RTTI. 

As an example, consider the code fragment shown in Fig. 3. In this example, 
Shape corresponds to an abstract base class. Circle and Rectangle are derived 
classes (using the first-member idiom), and the area function behaves as a virtual 
function in that it dynamically selects a specific area function to call depending 
on run-time type information stored in the kind field. 
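
For readers who want to run the dispatch, the following sketch completes Fig. 3 with bodies for circ_area and rect_area. Those bodies, the constant used for pi, and the -1.0 sentinel returned for an unknown kind are our assumptions; the figure leaves them unspecified.

```c
#include <assert.h>

typedef enum { CIRCLE, RECTANGLE } shape_kind;

typedef struct { shape_kind kind; } Shape;
typedef struct { Shape s; double radius; } Circle;
typedef struct { Shape s; double length, width; } Rectangle;

/* Assumed implementation: area of a circle. */
double circ_area(Circle *c) {
    return 3.141592653589793 * c->radius * c->radius;
}

/* Assumed implementation: area of a rectangle. */
double rect_area(Rectangle *r) {
    return r->length * r->width;
}

/* Dispatches on the run-time type tag, then downcasts to the derived type. */
double area(Shape *s) {
    assert(s != 0);
    switch (s->kind) {
    case CIRCLE:
        return circ_area((Circle *) s);
    case RECTANGLE:
        return rect_area((Rectangle *) s);
    }
    return -1.0;   /* unknown kind; sentinel chosen for this sketch */
}
```

The downcasts are only as trustworthy as the kind tag: if a Rectangle were tagged CIRCLE, area would read its length as a radius without complaint.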



The +1 idiom Another similar, but more complicated, mechanism for sim- 
ulating virtual functions involves the use of pointer arithmetic. This idiom is 
illustrated in Fig. 4. The example is based on a common idiom found in the 
telephone code discussed in Sect. 4. The idea in the example is that there are 
several kinds of messages that use a common message header (which includes 
run-time type information indicating the kind of message the header is attached 
to). In the processMsg function, the argument hdr is a pointer to a message 
header, and hdr + 1 is a pointer-arithmetic expression referring to the address hdr 
plus (one times) the size of the object pointed to by hdr. In other words, hdr + 
1 says “point to the next member in the struct containing what hdr points 
to”. 
typedef struct { 
    msg_hdr hdr; 
    msg1_body body; 
} msg1; 

typedef struct { 
    msg_hdr hdr; 
    msg2_body body; 
} msg2; 

void processMsg1(msg1_body *); 
void processMsg2(msg2_body *); 

void processMsg(msg_hdr *hdr) { 
    switch (hdr->kind) { 
    case MSG1: 
        processMsg1((msg1_body *) (hdr + 1)); 
        break; 
    case MSG2: 
        processMsg2((msg2_body *) (hdr + 1)); 
        break; 
    /* ... */ 
    } 
} 



Fig. 4. The +1 idiom: An example illustrating the use of pointer arithmetic and run- 
time type information to simulate virtual functions 



By C’s type rules, the expression hdr + 1 has the same type as hdr. So 
the cast causes a pointer to type msg_hdr to be treated as either a pointer to 
msg1_body or a pointer to msg2_body. Because msg_hdr need not have anything 
in common with msg1_body or msg2_body, this idiom is rather confusing when 
first encountered. However, because of the way C lays out data in memory, the 
cast makes sense.^ 
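
The layout assumption behind the +1 idiom can be made explicit. In the sketch below the header and body types are hypothetical (Fig. 4 does not define them), and the helper no_padding checks that each body begins exactly sizeof(msg_hdr) bytes into its message, which is what hdr + 1 silently assumes.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { MSG1, MSG2 } msg_kind;

typedef struct { msg_kind kind; } msg_hdr;     /* hypothetical header */
typedef struct { int value; } msg1_body;       /* hypothetical body   */
typedef struct { int a; int b; } msg2_body;    /* hypothetical body   */

typedef struct { msg_hdr hdr; msg1_body body; } msg1;
typedef struct { msg_hdr hdr; msg2_body body; } msg2;

/* Returns 1 when each body starts right after the header, with no padding. */
int no_padding(void) {
    return offsetof(msg1, body) == sizeof(msg_hdr)
        && offsetof(msg2, body) == sizeof(msg_hdr);
}

/* Reads a value out of the body that follows the header in memory. */
int process(msg_hdr *hdr) {
    assert(no_padding());
    switch (hdr->kind) {
    case MSG1:
        return ((msg1_body *) (hdr + 1))->value;
    case MSG2: {
        msg2_body *b = (msg2_body *) (hdr + 1);
        return b->a + b->b;
    }
    }
    return 0;   /* unknown kind */
}
```

With a body type that needs stricter alignment than the header (a double, say), the compiler may insert padding and hdr + 1 would no longer point at the body; the check above would then fail.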

It is one thing to identify an instance of the +1 idiom; it is another thing to 
determine if such an instance makes sense in terms of subtypes. The problem of 
making sense of casts such as these is outside the scope of this paper; however, 
we plan to address it in future research. 



2.5 Generic Pointers in C 

C programmers have long made use of generic pointers to achieve a limited form 
of polymorphism and subtyping. A generic pointer is much like the class Object 
in Java, for which all classes are subclasses: all pointer types may be thought 
of as subtypes of the generic pointer type. Prior to ANSI standardization, C 
programmers used pointers to a scalar type (usually char*) to represent generic 
pointers; now void* is the accepted type for generic pointers. The use of generic 
pointers is discussed further in Sect. 3.2 and Sect. 4. 

3 Physical Subtypes 

Casts allow expressions of one type to be substituted for expressions of another 
type. In this respect, casts between types are reminiscent of subtype relationships 
often used in other programming languages (like C++).^ In this section, we 

^ This assumes that the sizes and alignments of the msg_hdr and msg_body types are 
such that no padding is required between the hdr and body fields. 

® Substitution is a weaker notion than subtyping since it comes with no guarantee of 
expected behavior. The fact that a compiler allows an expression of one type to be 
define the notion of physical subtyping and present rules for determining if one 
type is a physical subtype of another. The motivation for these rules is to be 
able to automatically identify upcasts: type casts from t to t', where t can be 
thought of as a subtype of t'. 

The idea behind physical subtyping is that an expression of one type may 
be substituted for an expression of another type if, when the two are laid out 
in memory, the values stored in corresponding locations “make sense” . Consider 
the following code: 

Point pt; 
ColorPoint cp; 

pt.x = 3; pt.y = 41; 
cp.x = 5; cp.y = 17; cp.c = RED; 

A picture of how pt and cp are represented in memory might look like: 



pt:  |  3 | 41 | 

cp:  |  5 | 17 | RED | 



cp can be thought of as being of the same type as pt simply by ignoring its last 
field. 

We write t ≺ t' to denote that t is a physical subtype of type t'. The intuition 
behind physical subtypes can be summarized as follows: 

— The size of a type is no larger than the size of any of its subtypes. 

— Ground types are physical subtypes of themselves and not of other ground 
types. For example: 

    — int ≺ int 

    — int ⊀ double 

    — double ⊀ char 

    — an enumerated type is not a physical subtype of a different enumerated 
      type (or any other ground type) 

— If a struct type is a physical subtype of another struct type then the 
members of two types line up in some sensible fashion. 



3.1 A Type System for C 

Our work addresses a slightly simplified version of the C type system: 

— We ignore type qualifiers (e.g., const int and volatile int are treated as 
int). 

— We consider typedefs to be synonyms for the types they redefine. 

Types are described by the language of type expressions appearing in Fig. 5. 



used in place of an expression of another type does not preclude the occurrence of 
run-time errors. 
t ::= ground 
    | t[n]                      // array of type t of size n 
    | t ptr                     // pointer to t 
    | s{m1, ..., mk}            // struct 
    | u{|m1, ..., mk|}          // union 
    | (t1, ..., tk) -> t        // function 

m ::= (l, t, i)                 // member labeled l of type t at offset i 
    | l : n                     // bit field labeled l of size n 

ground ::= e{id1, ..., idk}     // enum 
         | void* | char | unsigned char | short 
         | int | long | double | ... 



Fig. 5. A type system for C 



Non-bit-field members of struct and union types are annotated with an offset. 
In a struct, the offset of a member indicates the difference in bytes between the 
storage location of this member and the first member of the struct. The first 
member is, by definition, at offset 0. All members of union types are considered 
to have offset 0. 



3.2 Physical Subtyping Rules 

Figure 6 presents rules in the style of [6] for inferring that one type is a physical 
subtype of another. We consider each of the physical-subtype rules individually: 

Reflexivity: Any type is a physical subtype of itself. 

Void pointers: A pointer to type t is a physical subtype of void*. Void pointers 
are generic: they can, by definition, only be used in contexts where any other 
pointer can be used. It is illegal to dereference an object of type void*. In fact, 
the only legal operations on a void pointer that are cause for concern are type 
casts. For example: 

Bar *b; 
Foo *f; 
void *vp; 

vp = (void *)b;   /* upcast: Bar* is a subtype of void* */ 
f = (Foo *)vp;    /* downcast: Foo* is a subtype of void* */ 

The cast from void* to Foo* is an example of a downcast, discussed in Sect. 2.3. 
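
As a minimal illustration of why only the casts matter, the sketch below passes a pointer through void* and back. Foo and the two helper functions are hypothetical; the point is that the compiler accepts the downcast whatever the original type was, so its correctness rests entirely on the programmer.

```c
#include <assert.h>

typedef struct { int n; } Foo;   /* hypothetical type */

/* Upcast: any object pointer may be converted to void*. */
void *to_generic(Foo *f) {
    return (void *) f;
}

/* Downcast: the caller asserts that vp really came from a Foo*. */
Foo *from_generic(void *vp) {
    assert(vp != 0);
    return (Foo *) vp;
}
```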
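$$\text{[Reflexivity]} \qquad t \prec t$$

$$\text{[Void pointers]} \qquad t\ \mathrm{ptr} \prec \mathtt{void*}$$

$$\text{[First members]} \qquad \frac{t \prec t' \qquad m_1 = (l, t, 0)}{s\{m_1, \ldots, m_k\} \prec t'}$$

$$\text{[Structures]} \qquad \frac{k' \le k \qquad m_1 \prec m'_1 \ \ldots\ m_{k'} \prec m'_{k'}}{s\{m_1, \ldots, m_k\} \prec s\{m'_1, \ldots, m'_{k'}\}}$$

$$\text{[Member subtype]} \qquad \frac{m = (l, t, i) \qquad m' = (l', t', i') \qquad l = l' \qquad i = i' \qquad t \prec t'}{m \prec m'}$$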

Fig. 6. Inference rules for physical subtypes 



First members: If t is a physical subtype of t' then a struct with a first member 
(the member at offset 0) of type t is a physical subtype of t'. This captures the 
first-member idiom described in Sect. 2. For example, assuming ColorPoint is 
a physical subtype of Point, then: 

typedef struct { 
    ColorPoint cp; 
    char *name; 
} NamedColorPoint; 

is a physical subtype of ColorPoint as well as a physical subtype of Point. 
(This example also illustrates the transitivity of the physical-subtype relation.) 

Structures: struct s with k members is a physical subtype of struct s' with k' 
members if: 

— s has no fewer members than s' (i.e., k' ≤ k). (Note the contravariance 
between the direction of the subtype relation and the direction of the inequality 
between k and k'.) 

— Each member of s' has the same label and the same offset as the 
corresponding member of s, and the types of the members of s' are physical 
supertypes of the types of the corresponding members of s. For example, 

struct { 
    int a; 
    struct { double d1, d2, d3; } b; 
    char c; 
} 

is a physical subtype of 

struct { 
    int a; 
    struct { double d1, d2; } b; 
} 



This is in contrast with Cardelli-style structural subtyping between record types 
([3,1]). A record type is like a struct type, but the order of members (and 
therefore the layout in memory) is unimportant. A record type {l1 : t1, ..., lk : tk} 
is a subtype of record type {l'1 : t'1, ..., l'k' : t'k'} iff for each label l'j there is 
an i such that li = l'j and ti is a subtype of t'j. 
4 Implementation and Results 

In this section, we describe the basic implementation of the physical-subtype- 
analysis algorithm, as well as the cast-analyzer tool that is based on this algo- 
rithm. We then present our experimental results and discuss other applications 
of the physical-subtype-analysis algorithm. 

4.1 Implementation 

The analysis tools are written in Standard ML. The tools act on data structures 
representing C types and abstract syntax trees. The abstract syntax trees can 
be generated from any preprocessed C program. 

The core physical-subtype-analysis algorithm takes as input two types, t and
t', and compares them to determine if t is a physical subtype of t' according to
the rules presented in Sect. 3.2. The algorithm returns a result in one of two
forms:

1. t is a physical subtype of t', together with numbers indicating how many
times each of the subtyping rules has been invoked in order to identify the
subtype relationship.

2. t is not a physical subtype of t'.

Given an abstract syntax tree representation of a C program, the cast-analyzer
tool proceeds by traversing the abstract syntax tree and collecting the
pairs of types associated with every implicit and explicit cast. For each pair of
types, t and t', involved in a cast,^ it returns one of three possible results:

1. Upcast: If the core physical-subtype-analysis algorithm returns that t is a
subtype of t', then the tool returns “upcast”.

2. Downcast: If the core algorithm returns that t is not a physical subtype of t',
then the core algorithm is applied to see whether t' is a physical subtype
of t. If the algorithm returns that t' is a subtype of t, then the tool returns
“downcast”.

3. Mismatch: If the core algorithm determines that t is not a physical subtype
of t' and t' is not a physical subtype of t, then the tool returns “mismatch”.
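
The three-way classification above can be sketched as a small C function. This is only an illustration: the actual tool is written in Standard ML, and the integer type ids, the SubtypeTest callback, and the toy_subtype relation are hypothetical stand-ins for the core physical-subtype algorithm.

```c
#include <stdbool.h>

typedef enum { UPCAST, DOWNCAST, MISMATCH } CastKind;

/* Hypothetical stand-in for the core algorithm: returns true when the
   type with id `sub` is a physical subtype of the type with id `super`. */
typedef bool (*SubtypeTest)(int sub, int super);

/* Classify a cast from type `from` to type `to`. */
CastKind classify_cast(int from, int to, SubtypeTest is_physical_subtype)
{
    if (is_physical_subtype(from, to))
        return UPCAST;
    if (is_physical_subtype(to, from))
        return DOWNCAST;
    return MISMATCH;
}

/* Toy relation for illustration: ids 0 (Point), 1 (ColorPoint), and
   2 (NamedColorPoint) form a chain; any other id relates only to itself. */
bool toy_subtype(int sub, int super)
{
    if (sub == super)
        return true;
    return sub >= 0 && sub <= 2 && super >= 0 && super <= 2 && sub > super;
}
```

With the toy relation, a cast from id 1 to id 0 is classified as an upcast, a cast from id 0 to id 2 as a downcast, and a cast involving an unrelated id as a mismatch.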

The output of the cast-analyzer tool is a list consisting of the following, for each 
occurrence of a cast: 

— The location in the file where the cast occurred. 

— The type being cast from. 

— The type being cast to. 

— The result of the cast analysis: upcast, downcast, or mismatch. 

If the cast analysis results in an upcast or downcast, then the tool outputs, 
along with the above information, numbers indicating how many times each of 
the subtyping rules have been invoked in order to identify the physical-subtype 
relationship. 

^ The cast appears in the program as t ptr to t' ptr, because in C, references to
structs are stored as pointers.
Table 1. Total counts of casts in benchmarks. kLOC is the number of source
lines (in thousands) in the program, including comments. Casts is the total number
of occurrences of casts, implicit and explicit, in the program. The remaining columns
break this total down as follows: Scalar is the number of casts not involving a struct,
union, or function pointer. FunPtr is the number of function-pointer casts. Void-Struct
represents casts in which exactly one of the types includes a struct or union
type (the other being a pointer type such as void* or char*). Struct-Struct represents
casts in which both types include a struct or union type. Each of these two categories
is further classified as an upcast (U), downcast (D), or mismatch (M), as specified by
the physical-subtype algorithm. There are a total of 7,796 Void-Struct and
Struct-Struct casts. Of these casts, 1,053 are classified as mismatches.

                                             Void-Struct           Struct-Struct
Benchmark   kLOC   Casts   Scalar  FunPtr      U      D      M      U     D     M
perl          27     325      204       5     60     41      0      0     3    12
xkernel       37   3,148    1,021      66    771    702    409     64    95    20
Total      1,360  23,947   15,704     447  2,753  2,788    538    688   514   515

[Rows for the other benchmarks (binutils, xemacs, gcc, telephone, bash, vortex,
ijpeg) are not recoverable from this copy.]



4.2 Experimental Results 

We applied the cast-analyzer tool to a number of C programs from the SPEC95
benchmarks (gcc, ijpeg, perl, vortex), as well as the networking code xkernel, GNU’s
bash, binutils, and xemacs, and portions of a Lucent Technologies product
(identified here as telephone).

Table 1 summarizes the various benchmarks analyzed, in terms of their size,
the total number of occurrences of casts, and types of casts, as classified by the
cast-analyzer tool. Table 2 presents the cast numbers, but only counts casts
between unique pairs of types. The number of casts in these programs, which
represent a wide variety of application domains, is non-trivial. Furthermore, we
see that a large number of the casts are between pointers to structs, evidence
that programmers must reason about the physical-type relationships between
structs. Of these casts, the majority are upcasts and downcasts, but a substantial
number are mismatches as well (i.e., there is no physical-subtype relationship
between the two types at a cast between pointer-to-struct types).

Notice that a very high percentage (91%) of the 1487 unique casts involving a
struct type (see the last six columns of Table 2) can be classified automatically
as either upcasts or downcasts. Furthermore, on simple manual inspection of
mismatches (discussed next), most of them turned out to be idioms indirectly
involving physical subtyping. In only a very small number of cases (fewer than
20) did a cast involving a pointer to struct appear completely unrelated to
physical subtyping. These numbers provide evidence that the idea of physical
subtyping is very useful in coping with casts appearing in C programs, and that
the process of relating casts to subtyping can largely be automated.

Table 2. Cast counts, for unique pairs of types



Examination of Mismatch Casts After running the cast-analyzer tool, we
manually examined all of the cases for which it reported a mismatch. This
exposed a number of questionable usages and interesting idioms (including the
“+1” idiom described in Sect. 2.4), some of which we report on below. The
mismatches reported under the Void-Struct category were primarily due to the
use of the type qualifier const: the cast analyzer reports a mismatch when a
struct S * is cast into a const void* (or vice versa). There are a small number
of mismatches due to other reasons. In gcc and xemacs, sometimes there is a
cast to or from a partially defined structure,^ which is reported under the
Void-Struct category. We believe that in these cases, partially defined structures are
used as a substitute for void*. In xkernel, the return values of certain functions
ought to be valid pointers under normal conditions, but carry an enum signifying
a status code under special conditions. Pointer values are thus compared to enum
constants using a cast. Clearly, this usage is unrelated to physical subtyping.

^ In C, one can reserve a structure tag name by declaring struct t; The name t is
reserved as a tag name in the scope in which this declaration appears. The structure
need not actually be defined anywhere at all. Such names are called partially defined
structures, and are used, for example, to define a pair of structures that contain
pointers to each other.

The mismatches under the Struct-Struct category are more interesting.
Most of them fall into one of the following patterns.

— A pointer to a union is cast to (or from) a pointer to one of the possible fields
within the union. The cast-analyzer tool does not compare a union type to
any other type except a void*. The selection of the “current interpretation”
of the union is an orthogonal but important issue. We also found variations
on this theme, such as a struct with a union as its last field, being cast
into another struct with the same sequence of fields except the last one;
the last field of the latter struct was one of the possible fields in the union.

— Upcast and downcast in the presence of bit-fields. The cast-analyzer tool 
does not identify physical-subtype relationships in the presence of bit-fields, 
because their memory layout is implementation dependent. 

— The two structs participating in the cast have a common prefix but then
diverge. Consider an example from bash. There are several variants of a struct
command, such as for_command, while_command, and simple_command. All
these structs have a common first field. A function that needs to examine
only the first field accepts all the variants of the command structs by the
following trick: it declares its formal argument to be of type simple_command*,
and at call sites the actual argument is cast to type simple_command*. (An
alternative would have been to declare a new base struct type containing
only the first field. All the command variants would then appear as subtypes
of the base type, and it would then be possible to make the polymorphic
nature of the function more explicit, by declaring its formal parameter to be
a pointer to the base type.)

— The “+1” idiom, as described previously in Sect. 2.4.

— The array padding idiom, as described previously in Sect. 2.1. 

The last three patterns also relate to physical subtyping, albeit indirectly. In 
each of them we can identify a pair of types in play, such that one type acts as 
a base type and another a physical subtype. For example, in the “+1” idiom, if 
the cast converts a type A into type B, we can think of the base type as A and 
the subtype as struct { A a; B b; }. The subtyping relation in these patterns 
cannot be inferred by the rules in Sect. 3.2. 
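
The alternative suggested for the common-prefix pattern (a base struct holding only the shared first field) might look like the following sketch. All type, field, and function names here are hypothetical; they are not taken from the bash sources.

```c
typedef enum { CMD_FOR, CMD_WHILE, CMD_SIMPLE } CommandKind;

/* Base type containing only the field shared by every variant. */
typedef struct { CommandKind kind; } CommandBase;

/* Each variant embeds the base as its first member, so each one is a
   physical subtype of CommandBase under the first-member rule. */
typedef struct { CommandBase base; int iterations; } ForCommand;
typedef struct { CommandBase base; int condition; } WhileCommand;
typedef struct { CommandBase base; const char *word; } SimpleCommand;

/* The polymorphism is now explicit in the parameter type: the function
   accepts any variant and reads only the shared field. */
CommandKind command_kind(const CommandBase *c)
{
    return c->kind;
}
```

Call sites can then pass &cmd.base, or cast a pointer to any variant to const CommandBase *, and a tool such as the cast analyzer would see an upcast rather than a mismatch.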

For a small number of exceptions (4 mismatches in gcc, 1 in telephone, 3 in 
xkernel, and 3 in perl code), we could not find any explanation at all. 

Telephone Code This section discusses a mismatch found by the cast-analyzer 
tool when applied to telephone, a large software system for call processing. (The 
code presented here is not the actual code analyzed, but a distilled version that 
illustrates the essential features.) This mismatch highlights a potentially dan- 
gerous coding style that exists in this code. 

Message passing is the common communication mechanism for telephone
switching systems, which are massive distributed systems. Such a system may
contain over a thousand different kinds of messages. Message formats in these 
systems generally follow the header-body paradigm: a header contains meta- 
information about the message; the body contains the contents of the message. 
The body itself may consist of another message, and so on. Messages are specified 
using structs and unions. 

Typically, a “dispatch” procedure receives a message from the operating sys- 
tem. Depending on the contents of the header, the dispatcher will call other 
procedures that deal with specialized sets of messages and expect a pointer to a
particular kind of message to be passed as an argument. Often, the dispatcher 
will “look ahead” into the body of a message to find a commonly occurring 
case that requires immediate handling. For example, we found such a dispatch 
procedure that declared its view of messages as: 

    struct {
        header hdr;
        union {
            Msg1 m1;
            Msg2 m2;
            struct { int x; int y; } m3;
        } body;
    } M;

There are three kinds of messages that can be nested inside Message,
represented by M.body.m1, M.body.m2, and M.body.m3. The first and second messages
reference typedef’d structs. Message m3 is declared inline. Now, the dispatcher
contains the following code:

    if (M.hdr.tag == 3 && M.body.m3.x == 1)
        process_m3(&M); /* implicit cast */

where the function process_m3 expects a pointer to the following structure:

    typedef struct {
        header h; int x; char c; int y;
    } Msg3;

The cast-analyzer tool flagged the implicit cast at the call as a “mismatch”
because the type of the field c of Msg3 does not match the type of the field y of the
anonymous struct represented by M.body.m3. Clearly, the code implies that
these two types represent the same message, yet they are incompatible. If the
dispatcher were to access field m3.y and the procedure process_m3 had accessed
(&M)->c, a physical type error would occur. A programmer simply examining
the dispatcher, oblivious to this problem, could easily insert a reference to m3.y.
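
The layout disagreement can be made concrete with offsetof. The sketch below assumes minimal definitions for header, Msg1, and Msg2 (the distilled code does not show them) and names the dispatcher's struct type Message; the helper function names are hypothetical.

```c
#include <stddef.h>

typedef struct { int tag; } header;   /* assumed definition */
typedef struct { int a; } Msg1;       /* assumed definition */
typedef struct { int b; } Msg2;       /* assumed definition */

/* The dispatcher's view of a message. */
typedef struct {
    header hdr;
    union {
        Msg1 m1;
        Msg2 m2;
        struct { int x; int y; } m3;
    } body;
} Message;

/* The view expected by process_m3. */
typedef struct {
    header h; int x; char c; int y;
} Msg3;

/* Nonzero when both views place y at the same offset. */
int y_offsets_agree(void)
{
    return offsetof(Message, body.m3.y) == offsetof(Msg3, y);
}

/* Nonzero when Msg3's char c occupies the offset where the
   dispatcher's view keeps the int y. */
int c_overlays_y(void)
{
    return offsetof(Msg3, c) == offsetof(Message, body.m3.y);
}
```

Under these assumed definitions the two views agree on x but not on y: the offset that the dispatcher reads as y holds c in Msg3, which is the physical type error described above.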



Identifying Virtual Functions in ijpeg ijpeg provides a set of generic
image-manipulation routines that convert an image from any one of a set of input
formats to any one of a set of output formats (although the JPEG file format is the
usual input or output type). The image-manipulation routines are written in a
fairly generic fashion, without reference to any specific image format.
Components of an image are accessed or changed via calls through function pointers that
are associated with each image object. The program initially sets up the
input-image and the output-image objects with functions that are format-specific, and
then passes pointers to the image objects to the generic image-manipulation
routines.

The main function and the various ijpeg functions that it calls have no
notion of the specific input-image type with which they are dealing. The selection
of the input image type and the initialization of the relevant function pointers
and data structures of instances of the jpeg_compress_struct type are done
during the call to the select_file_type function. This separation simulates
the object-oriented idiom of using abstract base classes and virtual functions to
build extensible software libraries. Each of the image-format-specific functions
performs a downcast when the function is entered. By examining these
downcasts, which were identified by the cast-analyzer tool, we were able to track down
the virtual-function idiom in ijpeg.
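
The virtual-function idiom described above can be sketched in C as follows. This is a generic illustration, not ijpeg's code: Image, GrayImage, and all function names are hypothetical.

```c
/* Generic "base class": the generic routines see only this view. */
typedef struct Image Image;
struct Image {
    int (*get_pixel)(const Image *self, int i);   /* "virtual function" */
    int width;
};

/* Format-specific "derived class"; the base is the first member. */
typedef struct {
    Image base;
    const unsigned char *bytes;
} GrayImage;

/* Format-specific function: downcasts on entry, as in the idiom. */
static int gray_get_pixel(const Image *self, int i)
{
    const GrayImage *g = (const GrayImage *)self;   /* downcast */
    return g->bytes[i];
}

/* Plays the role of the set-up step that installs the function pointers. */
void gray_init(GrayImage *g, const unsigned char *bytes, int width)
{
    g->base.get_pixel = gray_get_pixel;
    g->base.width = width;
    g->bytes = bytes;
}

/* Generic routine: no knowledge of the concrete image format. */
int image_sum(const Image *img)
{
    int total = 0;
    for (int i = 0; i < img->width; i++)
        total += img->get_pixel(img, i);
    return total;
}
```

The downcast in gray_get_pixel is the kind of cast that a cast analyzer would report, and finding such casts at the entry of each format-specific function is what reveals the idiom.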



4.3 Other Applications of Physical Subtypes 

Given an abstract syntax tree representation of a C program, the struct-analyzer
tool proceeds by traversing the abstract syntax tree and collecting a list of every
struct type defined in the program. For each pair of struct types, t and t', the
physical-subtype algorithm is used to determine if t is a subtype of t'. The
result of a struct analysis is a list consisting of the following, for each pair of
struct types for which some subtype relation has been identified:

— The subtype.

— The supertype.

— A list of numbers indicating how many times each of the subtyping rules
has been invoked in order to identify the subtype relationship.

For the C-to-C++ conversion problem, the struct-analyzer tool can help
identify potential class hierarchies. It is also a good complement to the
cast-analyzer tool. Sometimes, implied subtype relationships are obfuscated by casts
to and from generic types (usually void*). Struct analysis can assist the manual
tracking of such relationships. For example, given the definitions of Point and
ColorPoint shown in Fig. 1, struct-analyzer produces the following output:

                             Subtype Rules
    Subtype      Supertype   Reflex  Struct
    ColorPoint   Point       2       1

This indicates that ColorPoint is a subtype of Point by one use of the structure 
rule and two uses of the reflexivity rule. 
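
A toy model of the rule counting is sketched below. It covers only the reflexivity and structure rules, represents types by hand-written descriptors, and assumes that Point has two int members and that ColorPoint adds a color member (Fig. 1 is not reproduced here). The descriptor layout and all names are hypothetical; the actual tools are written in Standard ML.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { T_SCALAR, T_STRUCT } Kind;

typedef struct Type Type;
typedef struct {
    const char *label;
    size_t offset;
    const Type *type;
} Member;

struct Type {
    Kind kind;
    size_t n;                 /* number of members (0 for scalars) */
    const Member *members;
};

typedef struct { int reflex; int structure; } RuleCounts;

/* True if t is a physical subtype of u under the two modelled rules.
   The counts are meaningful only when the result is true. */
bool physical_subtype(const Type *t, const Type *u, RuleCounts *rc)
{
    if (t == u) {                        /* reflexivity rule */
        rc->reflex++;
        return true;
    }
    if (t->kind != T_STRUCT || u->kind != T_STRUCT)
        return false;
    if (t->n < u->n)                     /* t needs at least as many members */
        return false;
    for (size_t i = 0; i < u->n; i++) {
        const Member *a = &t->members[i];
        const Member *b = &u->members[i];
        if (strcmp(a->label, b->label) != 0 || a->offset != b->offset)
            return false;
        if (!physical_subtype(a->type, b->type, rc))
            return false;
    }
    rc->structure++;                     /* structure rule */
    return true;
}

/* Assumed descriptors for Point and ColorPoint. */
static const Type INT_T = { T_SCALAR, 0, NULL };
static const Type COLOR_T = { T_SCALAR, 0, NULL };
static const Member POINT_MEMBERS[] = {
    { "x", 0, &INT_T }, { "y", sizeof(int), &INT_T }
};
static const Member COLORPOINT_MEMBERS[] = {
    { "x", 0, &INT_T }, { "y", sizeof(int), &INT_T },
    { "c", 2 * sizeof(int), &COLOR_T }
};
const Type POINT_T = { T_STRUCT, 2, POINT_MEMBERS };
const Type COLORPOINT_T = { T_STRUCT, 3, COLORPOINT_MEMBERS };
```

Checking COLORPOINT_T against POINT_T applies reflexivity once for each of the shared int members and the structure rule once, matching the counts 2 and 1 in the output above.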

For analyzing larger systems, it is often useful to visualize the results of
physical-subtype analysis graphically. The output of the struct-analyzer tool can
be displayed as a graph where vertices represent structs and there is an edge
from t to t' if t is a physical subtype of t'. Figure 7(a) shows a small example
of such a graph from the SPEC95 benchmark vortex. This graph shows a small
“class hierarchy”: the class hierarchy is a tree with base class typebasetype and
derived classes integerdesctype, typedesctype, enumdesctype, and
chunkdesctype, which has as a physical subtype fieldstructype.

The output of the cast-analyzer tool is also suitable for visualization as a
graph. In this case, the vertices might represent types and edges upcast and
downcast relationships. Figure 7(b) shows a set of upcasts found in the vortex
benchmark. In this graph, a number of pointer types are cast to ptr(DrawObj),
which is a pointer to struct DrawObj.

[Figure 7 graphs not reproduced. Legible node labels in (a): chunkdesctype,
fieldstructype. Legible node labels in (b): ptr(DrawObj), ptr(DblPtrRect),
ptr(ArrayRect), ptr(IntChunkRect), ptr(XyRect), ptr(NamedXyRect),
ptr(Rectangle), ptr(VarrayRect), ptr(RefRect).]

Fig. 7. Graphical displays of physical subtype relations: (a) an example of the
physical-subtype relation for vortex; (b) an example of a set of upcasts found in vortex

5 Related Work 

The idea of applying alternate type systems to C appears in several places, 
among them [5,12,9,11,13]. Most of these references discuss the application of 
parametric polymorphism to C, while in this paper we discuss the application of
subtype polymorphism to C. The related work section in [11] describes related 
work pertaining to the application of parametric polymorphism to C. 

The type system developed in this paper has similarities with several type 
systems proposed by Cardelli [2,3,1]. The primary difference is that we take into 
account the physical layout of data types when determining subtype relation- 
ships, while in Cardelli’s work the notion of physical layout does not apply. In 
particular, there are differences between our notion of struct subtyping and 
Cardelli’s notion of record subtyping. In Cardelli’s formulation, a record r is a
subtype of a record r' if the set of labels occurring in r' is a subset of those
occurring in r and if the types of the members of r' are supertypes of their
corresponding members in r. In our system, a struct s is a subtype of a struct
s' if the set of labels and offsets of the members of s' is a subset of those
occurring in s, the types of all but the last member of s' match the corresponding
types in s (i.e., are supertypes and subtypes of the corresponding types in s), and
the type of the last member of s' is a supertype of the corresponding member of
s. (See Sect. 3.2.)

The tools we have developed based on physical-subtyping are related to, 
but complementary to, such tools as lint [8,7] and LCLint [4]. Our tools, as 
well as lint and LCLint, can be used to assist in static detection of type errors
that escape the notice of many C compilers. LCLint can identify problems and 
constructs that our system cannot - for example, problems with dereferencing 
null pointers - but only by requiring the user to add explicit annotations to 
the source code. On the other hand, neither lint nor LCLint has any notion of 
subtyping. Lint and LCLint can improve cleanliness of programs. Our tools can 
not only improve cleanliness, but can also help recognize fragile code. 
Abstract. To function on programs written in languages such as C that
make extensive use of pointers, automated software engineering tools re- 
quire safe alias information. Existing alias-analysis techniques that are 
sufficiently efficient for analysis on large software systems may provide 
alias information that is too imprecise for tools that use it: the impreci- 
sion of the alias information may (1) reduce the precision of the infor- 
mation provided by the tools and (2) increase the cost of the tools. This 
paper presents a flow-insensitive, context-sensitive points-to analysis al- 
gorithm that computes alias information that is almost as precise as that 
computed by Andersen’s algorithm - the most precise flow- and context- 
insensitive algorithm - and almost as efficient as Steensgaard’s algorithm 
- the most efficient flow- and context-insensitive algorithm. Our empiri- 
cal studies show that our algorithm scales to large programs better than 
Andersen’s algorithm and show that flow-insensitive alias analysis algo- 
rithms, such as our algorithm and Andersen’s algorithm, can compute 
alias information that is close in precision to that computed by the more 
expensive flow- and context-sensitive alias analysis algorithms. 

Keywords: Aliasing analysis, points-to graph, pointer analysis. 



1 Introduction 

Many automated tools have been proposed for use in software engineering. To 
function on programs written in languages such as C that make extensive use of 
pointers, these tools require alias information that determines the sets of memory 
locations accessed by dereferences of pointer variables. Atkinson and Griswold 
[2] discuss issues that must be considered when integrating alias information 
into whole-program analysis tools. They argue that, to effectively apply the 
tools to large programs, the alias-analysis algorithms must be fast. Thus, they 
propose an approach that uses Steensgaard’s algorithm [16], a flow- and context- 
insensitive alias-analysis algorithm^ that runs in near-linear time, to provide 
alias information for such tools. However, experiments show that, in many cases, 
Steensgaard’s algorithm computes very imprecise alias information [13,18]. This 
imprecision can adversely impact the performance of whole-program analysis. 

^ A flow-sensitive algorithm considers the order of statements in a program; a flow- 
insensitive algorithm does not. A context-sensitive algorithm considers the legal 
call/return sequences of procedures in a program; a context-insensitive algorithm 
does not. 

O. Nierstrasz, M. Lemoine (Eds.); ESEC/FSE’99, LNCS 1687, pp. 199-215, 1999. 

© Springer-Verlag Berlin Heidelberg 1999
Whole-program analysis can be affected by imprecise alias information in
two ways. First, imprecise alias information can decrease the precision of the 
information provided by the whole-program analysis. Our preliminary experi- 
ments show that the sizes of slices computed using alias information provided by 
Steensgaard’s algorithm can be almost ten percent larger than the sizes of slices 
computed using more precise alias information provided by Landi and Ryder’s 
algorithm [11], a flow-sensitive, context-sensitive alias-analysis algorithm. Sec- 
ond, imprecise alias information can greatly increase the cost of whole-program 
analysis. Our empirical studies show that it can take a sheer five times longer 
to compute a slice using alias information provided by Steensgaard’s algorithm 
than to compute the slice using alias information provided by Landi and Ryder’s 
algorithm; similar results are reported in [13]. These results indicate that the ex- 
tra time required to perform whole-program analysis with the less precise alias 
information might exceed the time saved in alias analysis with Steensgaard’s 
algorithm. 

One way to improve the efficiency of whole-program analysis tools is to use 
more precise alias information. The most precise alias information is provided 
by flow-sensitive, context-sensitive algorithms (e.g., [5,11,17]). The potentially 
large number of iterations required by these algorithms, however, makes them 
costly in both time and space. Thus, they are too expensive to be applicable 
to large programs. Andersen’s algorithm [1], another flow-insensitive, context- 
insensitive alias-analysis algorithm, provides more precise alias information than 
Steensgaard’s algorithm with less cost than flow-sensitive, context-sensitive al- 
gorithms. This algorithm, however, may require iteration among pointer-related 
assignments^ ($O(n^3)$ time where n is the program size), and requires that the
entire program be in memory during analysis. Thus, this algorithm may still be 
too expensive in time and space to be applicable to large programs. 

Our approach to providing alias information that is sufficiently precise for 
use in whole-program analysis, while maintaining efficiency, is to incorporate 
calling-context into a flow-insensitive alias-analysis algorithm to compute, for 
each procedure, the alias information that holds at all statements in that proce- 
dure. Our algorithm has three phases. In the first phase, the algorithm uses an 
approach similar to Steensgaard’s, to process pointer-related assignments and 
to compute alias information for each procedure in a program. In the second 
phase, the algorithm uses a bottom-up approach to propagate alias information 
from the called procedures (callees) to the calling procedures (callers). Finally, 
in the third phase, the algorithm uses a top-down approach to propagate alias 
information from callers to callees.^ 

This paper presents our alias-analysis algorithm. The main benefit of our
algorithm is that it efficiently computes an alias solution with high precision.
Like Steensgaard’s algorithm, our algorithm efficiently provides safe alias
information by processing each pointer-related assignment only once. However, our
algorithm computes a separate points-to graph for each procedure. Because a
single procedure typically contains only a few pointer-related variables and
assignments, our algorithm computes alias sets that are much smaller than those
computed by Steensgaard’s algorithm, and provides alias information that is
almost as precise as that computed by Andersen’s algorithm. Another benefit of
our algorithm is that it is modular. Because procedures in a strongly-connected
component of the call graph are in memory only thrice (once for each phase),
our algorithm is more suitable than Andersen’s for analyzing large programs.

^ A pointer-related assignment is a statement that can change the value of a pointer
variable.

^ Future work includes extending our algorithm to handle function pointers using an
approach similar to that discussed in Reference [2].

This paper also presents a set of empirical studies in which we investigate (a)
the efficiency and precision of three flow-insensitive algorithms (our algorithm,
Steensgaard’s algorithm, and Andersen’s algorithm) and Landi and Ryder’s
flow-sensitive algorithm [11], and (b) the impact of the alias information provided by
these four algorithms on whole-program analysis. These studies show a number
of interesting results:

— For the programs we studied, our algorithm and Andersen’s algorithm can 
compute a solution that is close in precision to that computed by a flow- and 
context-sensitive algorithm. 

— For programs where Andersen’s algorithm requires a large amount of time, 
our algorithm can compute the alias information in time close to Steens- 
gaard’s algorithm; thus, it may scale up to large programs better than An- 
dersen’s algorithm. 

— The alias information provided by our algorithm, Andersen’s algorithm, and 
Landi and Ryder’s algorithm can greatly reduce the cost of constructing 
system-dependence graphs and of performing data-flow based slicing. 

— Our algorithm is almost as effective as Andersen’s algorithm and Landi and 
Ryder’s algorithm in improving the performance of constructing system- 
dependence graphs and of performing data-flow based slicing. 

These results indicate that our algorithm can provide sufficiently precise alias 
information for whole-program analysis in an efficient way. Thus, it may be the 
most effective algorithm, among the four, for supporting whole-program analysis 
on large programs. 

2 Flow-Insensitive and Context-Insensitive Alias-Analysis 
Algorithms 

Flow-insensitive, context-insensitive alias-analysis algorithms compute alias in- 
formation that holds at every program point. These algorithms process pointer- 
related assignments in a program in an arbitrary order and replace a call state- 
ment with a set of assignments that represent the bindings of actual parameters 
and formal parameters. The algorithms compute safe alias information (points- 
to relations): for any pointer-related assignment, the set of locations pointed 
to by the left-hand side is a superset of the set of locations pointed to by the 
right-hand side. 
 1  int *buf1, *buf2;
 2  main() {
 3    int input[10];
 4    int i, *p, *q, *r;
 5    init();
 6    p = input;
 7    q = buf1;
 8    for( i=0;i<10;i++ ) {
 9      *q = *p;
10      p = incr_ptr(p);     /* ptr = p; p = incr_ptr; */
11      q = incr_ptr(q);     /* ptr = q; q = incr_ptr; */
12    }
13    q = buf2;
14    r = incr_ptr(q);       /* ptr = q; r = incr_ptr; */
15  }

16  void init() {
17    buf1 = (int *)malloc(20);
18    buf2 = (int *)malloc(20);
19  }

20  int *incr_ptr(int *ptr) {
21    return ptr+1;          /* incr_ptr = ptr; */
22  }

(a) (the comments reproduce the assignments shown in the solid boxes)

Fig. 1. Example program (a), points-to graph using Steensgaard’s algorithm (b),
points-to graph using Andersen’s algorithm (c).

We can view both Steensgaard’s algorithm and Andersen’s algorithm as
building points-to graphs [14].^ Vertices in a points-to graph represent
equivalence classes of memory locations (i.e., variables and heap-allocated objects),
and edges represent points-to relations among the locations.

Steensgaard’s algorithm forces all locations pointed to by a pointer to be in 
the same equivalence class, and, when it processes a pointer-related assignment, 
it forces the left-hand and right-hand sides of the assignment to point to the same 
equivalence class. Using this method, when new pointer-related assignments are 
processed, the points-to graph remains safe at a previously-processed pointer- 
related assignment. This method lets Steensgaard’s algorithm safely estimate 
the alias information by processing each pointer-related assignment only once. 

Figure 1(b) shows various stages in the construction of the points-to graph
for the example program of Figure 1(a) using Steensgaard’s algorithm. The top
graph (labeled (b.1)) shows the points-to graph in its initial stage, where all
pointers, except input, point to empty equivalence classes. When Steensgaard’s
algorithm processes statement 6, it merges the equivalence class pointed to by
input with the equivalence class pointed to by p; the merged equivalence class
is illustrated by the dotted box. Steensgaard’s algorithm processes statement
7 similarly; the merged equivalence class is illustrated by the dashed box. The
algorithm processes statements 10, 11, and 14 by simulating the bindings of
parameters and return values with the assignments shown in the solid boxes
in Figure 1. The middle graph (labeled (b.2)) shows the points-to graph after
Steensgaard’s algorithm has processed main().

^ A points-to graph is similar to an alias graph [3].

To represent the objects returned by malloc(), when Steensgaard’s algorithm
processes statements 17 and 18, it uses h_<statement number>. The bottom
graph (labeled (b.3)) shows the points-to graph after Steensgaard’s algorithm
processes the entire program. This graph illustrates that Steensgaard’s
algorithm can introduce many spurious points-to relations.
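
The unification step can be sketched with a union-find structure. This is a simplified illustration under stated assumptions, not Steensgaard's full algorithm: it handles only assignments of the forms p = q and p = &x, treats an array name as the address of the array, uses a fixed-size node pool, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 64

static int parent[MAX_NODES];    /* union-find forest over equivalence classes */
static int pointee[MAX_NODES];   /* class pointed to by a representative, or -1 */
static int node_count;

/* Create a fresh class that points to nothing yet. */
int new_node(void)
{
    assert(node_count < MAX_NODES);
    int n = node_count++;
    parent[n] = n;
    pointee[n] = -1;
    return n;
}

/* Representative of the class containing n. */
int find(int n)
{
    while (parent[n] != n) {
        parent[n] = parent[parent[n]];   /* path halving */
        n = parent[n];
    }
    return n;
}

/* Merge two classes; their pointees are merged too, so every class
   keeps at most one outgoing points-to edge. */
void join(int a, int b)
{
    a = find(a);
    b = find(b);
    if (a == b)
        return;
    int pa = pointee[a], pb = pointee[b];
    parent[b] = a;
    if (pa == -1)
        pointee[a] = pb;
    else if (pb != -1)
        join(pa, pb);
}

/* Class pointed to by p, created empty on first use. */
int target(int p)
{
    p = find(p);
    if (pointee[p] == -1)
        pointee[p] = new_node();
    return find(pointee[p]);
}

/* lhs = rhs: both sides must point to the same class. */
void assign_copy(int lhs, int rhs)
{
    int a = target(lhs);
    int b = target(rhs);
    join(a, b);
}

/* lhs = &loc: loc joins the class that lhs points to. */
void assign_addr(int lhs, int loc)
{
    join(target(lhs), loc);
}

/* True if p and q point to the same equivalence class. */
bool same_target(int p, int q)
{
    int a = pointee[find(p)], b = pointee[find(q)];
    return a != -1 && b != -1 && find(a) == find(b);
}
```

Each assignment is processed once. Feeding in the pointer-related assignments of the example program (including the simulated parameter and return bindings) places input, h_17, and h_18 in a single class under this sketch, which illustrates how unification introduces spurious points-to relations.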

Andersen’s algorithm uses a vertex to represent one memory location. This 
algorithm processes a pointer-related assignment by adding edges to force the 
left-hand side to point to the locations in the points-to set of the right-hand 
side. For example, when the algorithm processes statement 6, it adds an edge
to force p to point to input[]. Adding edges in this way, however, may cause
the alias information at a previously-processed pointer-related assignment S to
be unsafe; that is, the points-to set of S’s left-hand side is not a superset of
the points-to set of S’s right-hand side. To provide a safe solution, Andersen’s
algorithm iterates over previously processed pointer-related assignments until 
the points-to graph provides a safe alias solution. 
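
The iteration described above can be sketched as a simple fixed-point loop.
This Python sketch is illustrative only: it assumes just two assignment forms
(p = &x and p = q), and the function name andersen and the tuple encoding of
assignments are hypothetical.

```python
def andersen(assignments):
    """assignments: ("addr", p, x) encodes p = &x; ("copy", p, q) encodes p = q."""
    pts = {}
    changed = True
    while changed:                  # revisit assignments until every one is safe
        changed = False
        for kind, lhs, rhs in assignments:
            new = {rhs} if kind == "addr" else pts.get(rhs, set())
            cur = pts.setdefault(lhs, set())
            if not new <= cur:      # lhs set is not yet a superset of rhs set
                cur |= new
                changed = True
    return pts


pts = andersen([
    ("copy", "q", "p"),   # q = p   (processed before p has any target)
    ("addr", "p", "x"),   # p = &x  (makes the earlier q = p unsafe)
    ("addr", "r", "y"),   # r = &y
    ("copy", "r", "p"),   # r = p
])
```

Unlike unification, the inclusion is one-directional: r receives the targets of
p, but p does not receive y.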

Figure 1(c) shows various stages in the construction of the points-to graph 
using Andersen’s algorithm for the example program. The top graph (labeled
(c.1)) shows the points-to graph constructed by Andersen’s algorithm after it
processes main(). When the algorithm processes statements 10, 11, and 14,
it simulates the bindings of the parameters using the assignments shown in
the solid boxes. The middle graph (labeled (c.2)) shows the points-to graph
after Andersen’s algorithm processes statement 17. The algorithm forces buf1
to point to h_17, which causes the alias information to be unsafe at statement
7. To provide a safe solution, Andersen’s algorithm processes statement 7 again, 
which subsequently requires statements 11 and 14 to be reprocessed. The bottom 
graph (labeled (c.3)) shows the complete solution. This graph illustrates that 
Andersen’s algorithm can compute smaller points-to sets than Steensgaard’s 
algorithm for some pointer variables. However, Andersen’s algorithm requires 
more steps than Steensgaard’s algorithm. 

3 A Flow-Insensitive, Context-Sensitive Points-To
Analysis Algorithm

Our flow-insensitive, context-sensitive points-to analysis algorithm (FICS)
computes separate alias information for each procedure in a program. In this
section, we first present some definitions that we use to discuss our algorithm.
We next give an overview of the algorithm and then discuss the details of the
algorithm.

Fig. 2. Points-to graphs constructed by the FICS algorithm.



3.1 Definitions 

We refer to a memory location in a program by an object name [11], which
consists of a variable and a possibly empty sequence of dereferences and field
accesses. We say that an object name $N_1$ is extended from another object name
$N_2$ if $N_1$ can be constructed by applying a possibly empty sequence of
dereferences and field accesses $\omega$ to $N_2$; in this case, we denote $N_1$
as $\mathcal{E}_{\omega}(N_2)$. If $N$ is a formal parameter and $a$ is the
object name of the actual parameter that is bound to $N$ at call site $c$, we
define a function $A_c(\mathcal{E}_{\omega}(N))$ that returns object name
$\mathcal{E}_{\omega}(a)$. If $N$ is a global, $A_c(\mathcal{E}_{\omega}(N))$
returns $\mathcal{E}_{\omega}(N)$.

For example, suppose that p is a pointer that points to a struct with field a
(in the C language). Then $\mathcal{E}_{*}(p)$ is *p, $\mathcal{E}_{**}(p)$ is
**p, and $\mathcal{E}_{*.a}(p)$ is (*p).a. For another example, if p is a formal
parameter to function F, and *q is an actual parameter bound to p at call site
c to F, then $A_c((*p).a)$ returns (**q).a.
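
The two definitions above can be made concrete with a small sketch. The
representation below (a variable plus a tuple of accessors, with "*" for a
dereference and ".f" for a field access) and the helper names extend,
bind_actual, and show are assumptions made for illustration; they do not come
from the FICS implementation.

```python
from typing import NamedTuple, Tuple


class ObjectName(NamedTuple):
    var: str
    accessors: Tuple[str, ...] = ()   # "*" = dereference, ".f" = field f


def extend(name, omega):
    """E_omega(name): apply the accessor sequence omega to name."""
    return ObjectName(name.var, name.accessors + tuple(omega))


def bind_actual(name, formal_to_actual):
    """A_c(name): replace a formal by its bound actual; globals are unchanged."""
    actual = formal_to_actual.get(name.var)
    if actual is None:
        return name
    return extend(actual, name.accessors)


def show(name):
    """Render an object name in C syntax."""
    text = name.var
    for acc in name.accessors:
        text = "*" + text if acc == "*" else "(" + text + ")" + acc
    return text


p = ObjectName("p")
star_q = ObjectName("q", ("*",))                 # actual parameter *q
formal_name = extend(p, ["*", ".a"])             # (*p).a
caller_name = bind_actual(formal_name, {"p": star_q})
```

With p bound to *q, the callee-side name (*p).a maps to the caller-side name
(**q).a, matching the example in the text.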

We extend points-to graphs to represent structure variables. A field access
edge, labeled with a field name, connects a vertex representing a structure to a
vertex representing a field of the structure. A points-to edge, labeled with “*”,
represents a points-to relation. In such a points-to graph, labels are unique among
the edges leaving a vertex. Given an object name $N$, FICS can find an access
path $\mathcal{P}(N, G)$ in a points-to graph $G$: first, FICS locates or creates
vertex $n_0$ in $G$ to which $N$’s variable corresponds; then, FICS locates or
creates a sequence of vertices $n_i$ and edges $e_i$, $1 \le i \le k$, so that
$\mathcal{P}(N, G) = n_0, e_1, n_1, \ldots, e_k, n_k$ is a path in $G$ and the
labels of the edges in $\mathcal{P}(N, G)$ match the sequence of dereferences
and field accesses in $N$. We refer to $n_k$, the end vertex of
$\mathcal{P}(N, G)$, as the associated vertex of $N$ in $G$, and denote $n_k$ as
$\mathcal{V}(N, G)$. Note that the set of memory locations associated with
$\mathcal{V}(N, G)$ is the set of memory locations that are aliased to $N$.

3.2 Overview

FICS computes separate alias information for each procedure using points-to
graphs. FICS first computes a points-to graph $G_P$ for a procedure $P$ by
processing each pointer-related assignment in $P$ using an approach similar to
Steensgaard’s algorithm. If none of the pointer variables that appear in $P$ is
a global variable or a formal parameter, and none of the pointer variables is
used as an actual parameter, then $G_P$ safely estimates the alias information
for $P$. However, if some pointer variables that appear in $P$ are global
variables or formal parameters, or if some pointer variables are used as actual
parameters, then the pointer-related assignments in other procedures can also
introduce aliases related to these variables; $G_P$ must be further processed
to capture these aliases.

There are three cases in which pointer-related assignments in other procedures
can introduce aliases related to a pointer variable that appears in $P$. In
the first case, a pointer-related assignment in another procedure forces
$\mathcal{E}_{\omega}(g)$, where $g$ is a global variable that appears in $P$,
to be aliased to a memory location. Because FICS does not consider the order of
the statements, it must assume that such an alias pair holds throughout the
program. Thus, FICS must consider such an alias pair in $P$. For example, in
Figure 1(a), statement 17 forces *buf1 to be aliased to h_17; this alias pair
must be propagated to main() because main() uses buf1. FICS captures this type
of alias pair in $G_P$ in two steps: (1) it computes a global points-to graph,
$G_{glob}$, to estimate the memory locations that are aliased to each possible
global object name in the program; (2) it updates $G_P$ using the alias
information represented by $G_{glob}$.

In the second case, an assignment in a procedure called by $P$ forces
$\mathcal{E}_{\omega_1}(f_1)$ to be aliased to $\mathcal{E}_{\omega_2}(f_2)$,
where $f_1$ is a formal parameter and $f_2$ is either a formal parameter or a
global variable (the return value of a function is viewed as a formal
parameter). Alias pair $(\mathcal{E}_{\omega_1}(f_1), \mathcal{E}_{\omega_2}(f_2))$
can be propagated from the called procedure to $P$ and can force
$A_c(\mathcal{E}_{\omega_1}(f_1))$ to be aliased to
$A_c(\mathcal{E}_{\omega_2}(f_2))$ at call site $c$. For example, in Figure
1(a), statement 21 in function incr_ptr() forces *incr_ptr to be aliased to
*ptr. When this alias pair is propagated back to main(), it forces *r to be
aliased to *q. FICS maps the alias pairs related to the formal parameters to
the alias pairs related to the actual parameters and updates $G_P$ with the
alias pairs of the actual parameters.

In the third case, an assignment in a procedure that calls $P$ forces a location
$l$ to be aliased to $\mathcal{E}_{\omega}(a)$, where $a$ is an actual parameter
bound to $f$ at a call site $c$ to $P$. Alias pair $(\mathcal{E}_{\omega}(a), l)$
is propagated into $P$ and forces $\mathcal{E}_{\omega}(f)$ to be aliased to
$l$. For example, statement 6 forces (*p, input[]) to be an alias pair in main()
of Figure 1(a); (*p, input[]) is propagated into incr_ptr() at statement 10, and
forces (*ptr, input[]) to be an alias pair. FICS propagates this type of alias
pair from the calling procedure to $P$ and updates $G_P$.

FICS has three phases: Phase 1 processes the pointer-related assignments
in each procedure and initially builds the points-to graph for the procedure;
Phase 2 and Phase 3 handle the three cases discussed above. Phase 2 propagates
alias information from the called procedures to the calling procedures, and also
builds the points-to graph for the global variables using the alias information
available so far for a procedure. Phase 2 processes the procedures in a reverse
topological (bottom-up) order on the strongly-connected components of the call
graph. Within a strongly-connected component, Phase 2 iterates over the
procedures until the points-to graphs for the procedures stabilize. Phase 3
propagates alias information from the points-to graph for global variables to
each procedure. Phase 3 also propagates alias information from the calling
procedures to the called procedures. Phase 3 processes the procedures in a
topological (top-down) order on the strongly-connected components of the call
graph. Phase 3 iterates over procedures in a component until the points-to
graphs for the procedures stabilize. Because FICS propagates information from
called procedures to calling procedures (Phase 2) before it propagates
information from calling procedures to called procedures (Phase 3), it will
never propagate information through invalid call/return sequences. Therefore,
FICS is context-sensitive.
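
The processing orders used by Phase 2 and Phase 3 can be obtained with any
strongly-connected-component algorithm. The sketch below uses Tarjan’s
algorithm, which is a choice made here for illustration and is not named in the
text; the function name scc_bottom_up and the call graph (main() calling init()
and incr_ptr(), plus a hypothetical recursive pair a/b) are likewise
assumptions.

```python
def scc_bottom_up(call_graph):
    """Tarjan's algorithm on a caller -> callees graph.

    Components are emitted callees first (reverse topological order)."""
    index, low, on_stack = {}, {}, set()
    stack, order = [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in call_graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            order.append(comp)

    for v in call_graph:
        if v not in index:
            visit(v)
    return order


example = {"main": ["init", "incr_ptr"], "init": [], "incr_ptr": []}
phase2_order = scc_bottom_up(example)          # bottom-up: callees before main
phase3_order = list(reversed(phase2_order))    # top-down: main first

recursive = {"a": ["b"], "b": ["a", "c"], "c": []}
recursive_order = scc_bottom_up(recursive)     # a and b form one component
```

Within a component that contains more than one procedure, such as {a, b}
above, the phases would iterate until the points-to graphs stop changing.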

The bottom graphs in Figure 2 depict the points-to graphs computed by FICS
for the example program of Figure 1. The graphs show that, using FICS, variables
can be divided into equivalence classes differently in the points-to graphs of
different procedures. For example, in incr_ptr(), h_17, h_18, and input[] are in
one equivalence class. However, in main(), input[] is in a different equivalence
class than h_17 and h_18. Because FICS creates separate points-to graphs for
main() and incr_ptr(), it computes a more precise alias solution than
Steensgaard’s algorithm for the example program. The graphs also show that
FICS computes a smaller points-to set for p and q than Andersen’s algorithm
because it considers calling context. In the solution computed by Andersen’s
algorithm, p must point to the locations pointed to by incr_ptr under any calling
context; in the solution computed by FICS, p points only to the locations pointed
to by incr_ptr when incr_ptr() is invoked at statement 10. Under such a calling
context, incr_ptr points only to input[].



3.3 Algorithm Description 

Figure 3 shows FICS, which inputs $\mathcal{P}$, the program to be analyzed, and
outputs $C$, a list of points-to graphs, one for each procedure and one for the
global variables.



Phase 1: Create Points-To Graphs for Individual Procedures. In the first
phase (lines 1-7), FICS processes the pointer-related assignments in each
procedure $P_i$ in $\mathcal{P}$ to compute the points-to graph $G_{P_i}$. FICS
first finds or creates $v_1 = \mathcal{V}(\mathcal{E}_{*}(lhs), G_{P_i})$ and
$v_2 = \mathcal{V}(\mathcal{E}_{*}(rhs), G_{P_i})$ for each pointer-related
assignment lhs = rhs. Then, the algorithm uses Merge(), a variant of the “join”
operation in Steensgaard’s algorithm, to merge $v_1$ and $v_2$ into one vertex.
Merge() also merges the successors of $v_1$ and $v_2$ properly so that the
labels are unique among the edges leaving the new vertex. In this phase, FICS
ignores all call sites except those call sites to memory-allocation functions;
for such call sites, the algorithm uses h_<statement-number> to represent the
objects returned by these functions. Finally, FICS adds $P_i$ to $W_1$ and to
$W_2$, and adds $G_{P_i}$ to $C$.

algorithm FICS
input    $\mathcal{P}$: program to be analyzed
output   $C$: a list of points-to graphs, one for each procedure, one for global variables
declare  $P_i$, $P_j$, $P_k$, $P_l$: procedures in $\mathcal{P}$
         $G_{P_i}$: points-to graph for $P_i$
         $W_1$: a list of procedures, sorted reverse-topologically on the
                strongly-connected components of the call graph
         $W_2$: a list of procedures, sorted topologically on the
                strongly-connected components of the call graph
begin FICS
 1.  foreach procedure $P_i$ in $\mathcal{P}$ do                      /* phase 1 */
 2.      foreach pointer-related assignment lhs = rhs do
 3.          find or create $v_1$ for lhs, $v_2$ for rhs in $G_{P_i}$
 4.          Merge($G_{P_i}$, $v_1$, $v_2$)
 5.      endfor
 6.      Add $P_i$ to $W_1$; Add $P_i$ to $W_2$; Add $G_{P_i}$ to $C$
 7.  endfor
 8.  while $W_1 \neq \emptyset$ do                                    /* phase 2 */
 9.      $P_i$ = remove procedure from head of $W_1$
10.      foreach call site $c$ to $P_j$ in $P_i$ do
11.          Bind($actuals_c$, $G_{P_i}$, $formals_{P_j}$, $G_{P_j}$)
12.      endfor
13.      BindGlobal($globals(G_{P_i})$, $G_{glob}$, $G_{P_i}$)
14.      BindLoc($globals(G_{P_i})$, $G_{glob}$, $globals(G_{P_i})$, $G_{P_i}$)
15.      if $G_{P_i}$ is updated then
16.          foreach $P_i$’s caller $P_k$ do
17.              if $P_k$ not in $W_1$ then Add $P_k$ to $W_1$ endif
18.          endfor
19.      endif
20.  endwhile
21.  while $W_2 \neq \emptyset$ do                                    /* phase 3 */
22.      $P_j$ = remove procedure from head of $W_2$
23.      BindLoc($globals(G_{P_j})$, $G_{P_j}$, $globals(G_{P_j})$, $G_{glob}$)
24.      foreach call site $c$ from $P_i$ to $P_j$ do
25.          BindLoc($formals_{P_j}$, $G_{P_j}$, $actuals_c$, $G_{P_i}$)
26.      endfor
27.      if $G_{P_j}$ is updated then
28.          foreach $P_j$’s callee $P_l$ do
29.              if $P_l$ not in $W_2$ then Add $P_l$ to $W_2$ endif
30.          endfor
31.      endif
32.  endwhile
end FICS

Fig. 3. FICS: Flow-Insensitive, Context-Sensitive alias-analysis algorithm.

The points-to graphs on the top of Figure 2 are constructed by FICS, in
the first phase, for main() (left), init() (middle), and incr_ptr() (right) of the
example program. Note that the points-to relations introduced by init(), such as
the points-to relation between buf1 and h_17, are not yet represented in main()’s
points-to graph. In the following two phases, FICS gathers alias information from
both callees and callers of $P_i$ to further build $G_{P_i}$.
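
The first phase can be sketched as follows. This is a minimal illustration
under stated assumptions rather than the FICS implementation: the class
PointsToGraph and its methods are hypothetical, object names are given as a
variable plus a tuple of accessors ("*" or a field label such as ".a"), and an
address-of right-hand side is handled by a separate helper.

```python
class PointsToGraph:
    """Toy per-procedure points-to graph with labeled edges (hypothetical)."""

    def __init__(self):
        self.parent = {}       # union-find over vertex ids
        self.edges = {}        # representative vertex -> {label: target vertex}
        self.var_vertex = {}   # variable name -> vertex id
        self.count = 0

    def new_vertex(self):
        v = self.count
        self.count += 1
        self.parent[v] = v
        self.edges[v] = {}
        return v

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def vertex(self, var, accessors=()):
        """Associated vertex of an object name: follow or create one edge per accessor."""
        if var not in self.var_vertex:
            self.var_vertex[var] = self.new_vertex()
        v = self.find(self.var_vertex[var])
        for label in accessors:
            out = self.edges[v]
            if label not in out:
                out[label] = self.new_vertex()
            v = self.find(out[label])
        return v

    def merge(self, a, b):
        """Merge two vertices, then same-labeled successors, so labels stay unique."""
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        moved = self.edges.pop(b)
        for label, target in moved.items():
            a = self.find(a)           # a recursive merge may have changed the root
            mine = self.edges[a]
            if label in mine:
                self.merge(mine[label], target)
            else:
                mine[label] = target

    def assign(self, lhs, rhs):
        """lhs = rhs: merge the vertices reached by dereferencing both sides."""
        v1 = self.vertex(lhs[0], tuple(lhs[1]) + ("*",))
        v2 = self.vertex(rhs[0], tuple(rhs[1]) + ("*",))
        self.merge(v1, v2)

    def assign_addr(self, lhs, var):
        """lhs = &var: the dereference of lhs is var itself."""
        self.merge(self.vertex(lhs[0], tuple(lhs[1]) + ("*",)), self.vertex(var))


g = PointsToGraph()
g.assign_addr(("p", ()), "x")      # p = &x
g.assign(("q", ()), ("p", ()))     # q = p
```

After these two assignments, *q and x share one vertex, while p and q keep
separate vertices whose “*” edges lead to that shared vertex.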



Phase 2: Compute Aliases Introduced at Callsites and Create Global
Points-to Graph. In the second phase (lines 8-20), for each procedure $P_i$, FICS
computes the aliases introduced at $P_i$’s call sites. For each call site $c$ to
procedure $P_j$ in $P_i$, FICS calls Bind() to find alias pairs of
$(\mathcal{E}_{\omega_1}(f_1), \mathcal{E}_{\omega_2}(f_2))$, where $f_1$ and
$f_2$ are $P_j$’s formal pointer parameters, using a depth-first search on
$G_{P_j}$. The search begins at the vertices associated with $P_j$’s formal
parameters of pointer type, looking for possible pairs of
$\mathcal{P}(\mathcal{E}_{\omega_1}(f_1), G_{P_j})$ and
$\mathcal{P}(\mathcal{E}_{\omega_2}(f_2), G_{P_j})$ that end at the same vertex.
This implies that $\mathcal{E}_{\omega_1}(f_1)$ is aliased to
$\mathcal{E}_{\omega_2}(f_2)$. Bind() maps this type of alias pair back to $P_i$
and captures the alias pairs in $G_{P_i}$ by merging the end vertices of
$\mathcal{P}(A_c(\mathcal{E}_{\omega_1}(f_1)), G_{P_i})$ and
$\mathcal{P}(A_c(\mathcal{E}_{\omega_2}(f_2)), G_{P_i})$ in $G_{P_i}$. For
example, FICS calls Bind() to process the call site at statement 14 in Figure
1. Bind() finds alias pair (*ptr, *incr_ptr) in $G_{incr\_ptr}$. Then, it
substitutes ptr with q and incr_ptr with r, creates an alias pair (*q, *r), and
merges $\mathcal{V}(*q, G_{main})$ and $\mathcal{V}(*r, G_{main})$.

Bind() also searches for $\mathcal{P}(\mathcal{E}_{\omega_1}(f), G_{P_j})$ and
$\mathcal{P}(\mathcal{E}_{\omega_2}(g), G_{P_j})$, where $f$ is a formal pointer
parameter and $g$ is a global variable, that end at the same vertex. Similarly,
Bind() merges the end vertices of
$\mathcal{P}(A_c(\mathcal{E}_{\omega_1}(f)), G_{P_i})$ and
$\mathcal{P}(\mathcal{E}_{\omega_2}(g), G_{P_i})$ in $G_{P_i}$.

In this phase, FICS also calls BindGlobal() to compute the global points-to
graph $G_{glob}$ with the alias information of $P_i$. BindGlobal() finds alias
pairs $(\mathcal{E}_{\omega_1}(g_1), \mathcal{E}_{\omega_2}(g_2))$, where $g_1$
and $g_2$ are global variables, using a depth-first search in $G_{P_i}$. The
search begins at the associated vertices of global variables in $G_{P_i}$ and
looks for pairs of access paths
$\mathcal{P}(\mathcal{E}_{\omega_1}(g_1), G_{P_i})$ and
$\mathcal{P}(\mathcal{E}_{\omega_2}(g_2), G_{P_i})$ that end at one vertex.
BindGlobal() then merges the end vertices of
$\mathcal{P}(\mathcal{E}_{\omega_1}(g_1), G_{glob})$ and
$\mathcal{P}(\mathcal{E}_{\omega_2}(g_2), G_{glob})$ in $G_{glob}$. For example,
when FICS processes main() in this phase, it calls BindGlobal() to search
$G_{main}$ and finds that $\mathcal{P}(*buf1, G_{main})$ and
$\mathcal{P}(*buf2, G_{main})$ end at the same vertex. Thus, FICS merges
$\mathcal{V}(*buf1, G_{glob})$ and $\mathcal{V}(*buf2, G_{glob})$.

FICS also computes the memory locations that are aliased to
$\mathcal{E}_{\omega}(g)$, where $g$ is a global. If a location $l$ is in the
equivalence class represented by $\mathcal{V}(\mathcal{E}_{\omega}(g), G_{P_i})$,
then $(\mathcal{E}_{\omega}(g), l)$ is an alias pair. FICS calls BindLoc() to
look for $\mathcal{V}(\mathcal{E}_{\omega}(g), G_{P_i})$ using a depth-first
search. For each location $l$ associated with
$\mathcal{V}(\mathcal{E}_{\omega}(g), G_{P_i})$, BindLoc() merges
$\mathcal{V}(l, G_{glob})$ with $\mathcal{V}(\mathcal{E}_{\omega}(g), G_{glob})$
to capture the alias pair $(\mathcal{E}_{\omega}(g), l)$ in $G_{glob}$. For
example, when FICS processes init() in this phase, it merges
$\mathcal{V}(h\_17, G_{glob})$ with $\mathcal{V}(*buf1, G_{glob})$ because h_17
is associated with $\mathcal{V}(*buf1, G_{init})$. After this phase, $G_{glob}$
is complete.
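
The search performed by Bind() can be approximated by the sketch below. It is
an illustration under simplifying assumptions, not the procedure used by FICS:
the callee graph is a plain dictionary from vertex to labeled successors, each
root contributes at most one access path per vertex, actual parameters are
plain variables, and the function names are hypothetical.

```python
def reachable_names(graph, roots):
    """Depth-first search from each root variable.

    Returns a map from vertex to the object names (variable, accessors)
    whose access paths end at that vertex."""
    names_at = {}
    for var, start in roots.items():
        seen = set()
        stack = [(start, ())]
        while stack:
            v, acc = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            names_at.setdefault(v, []).append((var, acc))
            for label, w in graph.get(v, {}).items():
                stack.append((w, acc + (label,)))
    return names_at


def alias_pairs(graph, roots):
    """Pairs of object names from different roots that share an end vertex."""
    pairs = []
    for names in reachable_names(graph, roots).values():
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                pairs.append((names[i], names[j]))
    return pairs


def to_caller(pair, binding):
    """Apply the formal-to-actual binding of a call site, keeping the accessors."""
    return tuple((binding.get(var, var), acc) for var, acc in pair)


# Assumed shape of the callee graph for incr_ptr(): ptr and the return value
# incr_ptr are separate vertices whose "*" edges lead to the same vertex.
g_incr_ptr = {0: {"*": 2}, 1: {"*": 2}}
formals = {"ptr": 0, "incr_ptr": 1}

pairs = alias_pairs(g_incr_ptr, formals)
caller_pair = to_caller(pairs[0], {"ptr": "q", "incr_ptr": "r"})
```

For the call site at statement 14, the pair (*ptr, *incr_ptr) found in the
callee becomes (*q, *r) in the caller, whose associated vertices would then be
merged.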



Phase 3: Compute Aliases Introduced by the Calling Environment. In the
third phase (lines 21-32), FICS computes the sets of locations represented by
the vertices in $G_{P_j}$ and completes the computation of $G_{P_j}$. FICS first
computes the locations for vertices in $G_{P_j}$ from $G_{glob}$. Let $g$ be a
global variable that appears in $G_{P_j}$. FICS calls BindLoc() to look for
$\mathcal{V}(\mathcal{E}_{\omega}(g), G_{P_j})$ using a depth-first search.
BindLoc() then copies the memory locations from
$\mathcal{V}(\mathcal{E}_{\omega}(g), G_{glob})$ to
$\mathcal{V}(\mathcal{E}_{\omega}(g), G_{P_j})$. For example, when FICS
processes main() in the example of Figure 1, it copies h_17 and h_18 from
$\mathcal{V}(*buf1, G_{glob})$ to $\mathcal{V}(*buf1, G_{main})$.
FICS also computes the locations for vertices in $G_{P_j}$ from $G_{P_i}$, given
that $P_i$ calls $P_j$ at a call site $c$. Suppose $a$ is bound to formal
parameter $f$ at $c$. FICS calls BindLoc() to copy the locations from
$\mathcal{V}(\mathcal{E}_{\omega}(a), G_{P_i})$ to
$\mathcal{V}(\mathcal{E}_{\omega}(f), G_{P_j})$ to capture the fact that the
aliased locations of $\mathcal{E}_{\omega}(a)$ are also aliased to
$\mathcal{E}_{\omega}(f)$. For example, FICS copies input[] from
$\mathcal{V}(*p, G_{main})$ to $\mathcal{V}(*ptr, G_{incr\_ptr})$ because p
is bound to ptr at statement 10. After this phase, the set of memory locations
represented by each vertex is complete.

Complexity of the FICS Algorithm.* Theoretically, it is possible to construct
a program $\mathcal{P}$ that has $O(2^n)$ distinguishable locations [15], where
$n$ is the size of $\mathcal{P}$. This makes any alias-analysis algorithm
discussed in this paper exponential in time and space in the size of
$\mathcal{P}$. In practice, however, the total number of distinguishable
locations in $\mathcal{P}$ is $O(n)$ and a structure in $\mathcal{P}$ typically
has a limited number of fields.

Let $p$ be the number of procedures in $\mathcal{P}$ and $S$ be the worst-case
actual size of the points-to graph computed for a procedure. The space
complexity of FICS is $O(p \cdot S + n)$. In the absence of recursion, each
procedure $P$ is processed once at each phase. Thus, Bind(), BindGlobal(), and
BindLoc() are invoked $O(NumOfCall + p)$ times. In the presence of recursion, a
single change in $G_P$ might require one propagation to each of $P$’s callers
and one propagation to each of $P$’s callees. $G_P$ changes $O(S)$ times; thus,
Bind(), BindGlobal(), and BindLoc() are invoked $O((NumOfCall + p) \cdot S)$
times. When the points-to graph is implemented with a fast find/union
structure, each invocation of Bind(), BindGlobal(), and BindLoc() requires
$O(S)$ “find” operations on a fast find/union structure with size
$O(p \cdot S)$. Let $N$ be $NumOfCall + p$ in the absence of recursion and
$(NumOfCall + p) \cdot S$ in the presence of recursion. The time complexity of
FICS is $O((N \cdot S + p \cdot S)\,\alpha(N \cdot S, p \cdot S))$, where
$\alpha$ is the inverse Ackermann function. In practice, we can expect
$NumOfCall \cdot S$ to be $O(n)$. Thus, we can expect to run FICS in time
almost linear in the size of the program in practice.
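
As a worked reading of the bound for the non-recursive case, the substitution
below is a sketch. It relies on one additional assumption that is made here
only for illustration and is not stated in the text, namely that $p \cdot S$ is
also $O(n)$.

```latex
\begin{align*}
N &= \mathit{NumOfCall} + p \\
N \cdot S + p \cdot S &= \mathit{NumOfCall} \cdot S + 2\,p \cdot S \\
  &= O(n) \qquad \text{if } \mathit{NumOfCall} \cdot S = O(n)
     \text{ and } p \cdot S = O(n) \\
O\bigl((N \cdot S + p \cdot S)\,\alpha(N \cdot S,\, p \cdot S)\bigr)
  &= O\bigl(n\,\alpha(N \cdot S,\, p \cdot S)\bigr)
\end{align*}
```

Because the inverse Ackermann function grows extremely slowly, the last line
corresponds to the "almost linear" running time stated above.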



4 Empirical Studies 

To investigate the efficiency and precision of FICS and the impact on
whole-program analysis of alias information of various precision levels, we
performed several studies in which we compared the algorithm with Steensgaard’s
algorithm (ST) [16], Andersen’s algorithm (AND) [1], and Landi and Ryder’s
algorithm (LR) [11]. We used the PROLANGS Analysis Framework (PAF) [6] to
implement, with points-to graphs, FICS, Steensgaard’s algorithm, and Andersen’s
algorithm. We used the implementation of Landi and Ryder’s algorithm
provided by PAF. None of these implementations handles function pointers or
setjmp-longjmp constructs.

The left-hand side of Table 1 gives information about a subset of the subject
programs used in the studies.† To allow the algorithms to capture the aliases
introduced by calls to library functions, we created a set of stubs that simulate
the effects of these functions on aliases. However, we did not create stubs for the
functions that would not introduce aliases at calls to these functions because,
in preliminary studies, we observed that using stubs forces Steensgaard’s
algorithm to introduce many additional points-to relations. For example, for
dixie, using stubs for the functions that would not introduce aliases at calls,
FICS computes, on average, ThruDeref Mod [18] of 29.45, whereas not using such
stubs, it computes, on average, ThruDeref Mod of 22.10 (see Study 1).

* Details of the complexity analysis for FICS can be found in [12].

† T-W-MC and moria are not used in Studies 2 and 3 because the slicer requires
more than 10 hours, the time limit we set for slicing, to collect the data.

Table 1. Subject programs and time in seconds to compute alias solutions.
PRAs are pointer-related assignments; "-" marks data that are not available.

           Lines of  Number of   Number of   Number            Time (seconds)
Program        Code  CFG Nodes  Procedures  of PRAs     ST    FICS     AND      LR
loader         1132        819          32       42   0.05    0.14    0.16    1.45
ansitape       1596       1087          37       59   0.06    0.16    0.19    0.54
dixie          2100       1357          52      149    0.1    0.22     0.3    0.92
learn          1600       1596          50      129   0.08     0.2    0.35    1.47
unzip          4075       1892          42      144   0.06    0.16    0.19    1.75
smail          3212       2430          59      378   0.48    0.74     2.8       -
simulator      3558       2992         114       83   0.11    0.38    0.34    1.43
flex           6902       3762          93      231   0.14    0.42    0.53  410.28
space         11474       5601         137      732   0.62    1.77    4.64  113.39
bison          7893       6533         134     1170   0.33    0.78    1.27       -
larn           9966      11796         295      642   0.37     1.2     1.2       -
mpeg_play     17263      11864         135     1782   0.92    3.18    4.92       -
espresso      12864      15351         306     2706   4.21   10.69  957.16       -
moria         25002      20316         482      785   2.34    3.68  521.82       -
T-W-MC        23922      22167         247     2228   0.83    4.41   73.31       -



Study 1. In study 1, we compare the performance and precision of Steensgaard’s
algorithm, FICS, Andersen’s algorithm, and Landi and Ryder’s algorithm. For
each subject program, we recorded the time required to compute the alias
information (Time) and the average number of locations modified through
dereference (ThruDeref Mod) [18].

The right-hand side of Table 1 shows the running time of the algorithms
on the subject programs.* We collected these data by running our system on a
Sun Ultra 1 workstation with 128 MB of physical memory and 256 MB of virtual
memory. The table shows that, for our subject programs, the flow-insensitive
algorithms run significantly faster than Landi and Ryder’s algorithm. The table
also shows that, for small programs, both FICS and Andersen’s algorithm have
running times close to that of Steensgaard’s algorithm. However, for the large
programs where Andersen’s algorithm takes a large amount of time, FICS still
runs in time close to Steensgaard’s algorithm. This result suggests that, for
large programs, FICS is more efficient in time than Andersen’s algorithm.

Figure 4 shows the average ThruDeref Mod for the four algorithms.
The graph shows that, for many programs, Steensgaard’s algorithm computes
very imprecise alias information, which might limit its applicability to other
data-flow analyses. The graph also shows that, for our subject programs, FICS
computes alias solutions with ThruDeref Mod close to that computed by
Andersen’s algorithm. For smail and espresso, FICS computes smaller ThruDeref
Mod than Andersen’s algorithm because these two programs have functions
similar to incr_ptr() in Figure 1, on which Andersen’s algorithm loses precision
because it does not consider calling context. The graph further shows that the
solutions computed by FICS and Andersen’s algorithm are very close to that
computed by Landi and Ryder’s algorithm. This result suggests that, for many
data-flow problems, aliases obtained using FICS or Andersen’s algorithm might
provide sufficient precision. Note that, because Landi and Ryder’s algorithm uses
a k-limiting technique, which collapses the fields of a structure, to handle
recursive data structures [11], the points-to set for a pointer p computed by
Landi and Ryder’s algorithm may contain locations that are not in the points-to
set for p computed by the three flow-insensitive algorithms. Thus, Andersen’s
algorithm provides a smaller alias solution than Landi and Ryder’s algorithm
for loader and space.

* Data on Landi and Ryder’s algorithm are not available for seven programs because
the analysis required more than 10 hours, the limit we set for the analysis.

Fig. 4. ThruDeref Mod for the subject programs.

Table 2. Average number of summary edges (S) per call and average time (T) in
seconds to compute the summary edges for a call in a system dependence graph.
"-" marks data that are not available.

                                  Raw Data                                  % of Steensgaard
                ST           FICS           AND            LR           FICS           AND            LR
program         S      T      S      T      S      T      S      T      S      T      S      T      S      T
loader        465    2.3    195    1.1    195    1.1    199    1.1   41.9   47.4   41.9   47.8   42.8   50.0
ansitape      880    2.6    533    1.7    431    1.2    400    1.2   60.6   66.7   49.0   48.5   45.5   45.8
dixie         821    2.5    314    1.5    227    1.1    206    1.0   38.3   58.4   27.7   42.5   25.2   40.0
learn        1578    7.6    209    1.3    173    1.1    159    1.0   13.3   17.1   11.0   14.1   10.1   12.9
unzip        1979    9.4    738    4.0    687    3.4    402    2.1   37.3   42.9   34.7   36.4   20.3   22.1
smail        3518   15.8   2703   11.3   2260    8.7      -      -   76.8   71.4   64.2   54.9      -      -
simulator     979    2.0    736    1.2    736    1.2    535    1.0   75.1   62.4   75.1   62.6   54.6   50.3
flex         1156   12.1    620    8.0    579    7.5    550    7.4   53.6   66.1   50.1   61.9   47.6   61.3
space        7562   19.4   5639   10.4   5525   10.2   3839    7.5   74.6   53.4   73.1   52.7   50.8   38.5
bison         679    2.6    653    1.6    520    1.1      -      -   96.2   62.4   76.6   43.4      -      -
larn        36726  182.9   9582   38.2   8087   30.9      -      -   26.1   20.9   22.0   16.9      -      -
mpeg_play    1306   32.2    946   23.9    940   21.8      -      -   72.4   74.2   72.0   67.7      -      -
espresso    13964  121.5   8540   60.5  10518   82.9      -      -   61.2   49.8   75.3   68.3      -      -



Study 2. In study 2, we investigate the impact of the alias information
provided by the four algorithms on the size and the cost of the construction of
one program representation, the system-dependence graph [10].* We study the
average number of summary edges per call and the cost to compute these
summary edges in a system dependence graph. The summary edges are computed by
slicing through each procedure with respect to each memory location that can
be modified by the procedure using Harrold and Ci’s slicer [7]. Thus, the time
required to compute the summary edges might differ from the time required to
compute the summary edges using other methods (e.g., [10]). Nevertheless, this
approach provides a fair way to compare the costs of computing summary edges
using alias information of different precision levels.

Table 2 shows the results of this study. We obtained these results on a Sun
Ultra 30 workstation with 640 MB of physical memory and 1 GB of virtual memory.
The table shows that using more precise alias information provided by FICS,
Andersen’s algorithm, and Landi and Ryder’s algorithm can effectively reduce
both the average number of summary edges per call and the time to compute the
summary edges in the construction of a system-dependence graph.† The table
further shows that, for our subject programs, using alias information provided
by FICS is almost as effective as using alias information provided by Andersen’s
algorithm. Our algorithm is even more effective than Andersen’s algorithm on
espresso because our algorithm computes a smaller points-to set for the pointer
variables. These results suggest that FICS is preferable to Andersen’s algorithm
in building system-dependence graphs for large programs because FICS can run
significantly faster than Andersen’s algorithm on large programs.

* A system-dependence graph can be used to slice a program; computing summary
edges is the most expensive part of constructing such a graph.

† Similar results of time were reported in [13] where Steensgaard’s, Shapiro’s and
Andersen’s algorithms were compared.

Study 3. In study 3, we investigate the impact of the alias information provided
by the four alias-analysis algorithms on the sizes of the slices and the cost of
computing the slices. We obtained the slices by running Harrold and Ci’s slicer
[7] on each slicing criterion of interest, without stored reuse information.

Table 3 shows the results of this study. We obtained these results on a Sun
Ultra 30 workstation with 640 MB of physical memory and 1 GB of virtual memory.
The table shows that, for all the subject programs, using more precise alias
information than that computed by Steensgaard’s algorithm can significantly
reduce the time to compute a slice. The table also shows that, for four
programs, using more precise alias information can significantly (> 10%) reduce
the sizes of the slices. These four programs illustrate exceptions to the
conclusion drawn by Shapiro and Horwitz [13] that the sizes of slices are hardly
affected by the precision of the alias information. Note that for five of the
programs, the slicer computes larger slices using alias information provided by
Landi and Ryder’s algorithm than using that provided by FICS and Andersen’s
algorithm because the points-to set computed by Landi and Ryder’s algorithm for
a pointer p contains memory locations that are not in the points-to set
computed by Steensgaard’s, FICS, or Andersen’s algorithms for p. The table
further shows that using alias information provided by FICS is almost as
effective as using alias information provided by Andersen’s algorithm in
computing slices. This further supports our conclusion that FICS is preferable
to Andersen’s algorithm in whole-program analysis.

Table 3. Average size of a slice (S) and average time (T) in seconds to compute a
slice. "-" marks data that are not available.

                                  Raw Data                                  % of Steensgaard
                ST           FICS           AND            LR           FICS           AND            LR
program         S      T      S      T      S      T      S      T      S      T      S      T      S      T
loader        207    5.3    192    3.4    192    3.3    194    3.5   93.0   64.1   93.0   63.4   93.8   66.5
ansitape      290   16.6    284    9.6    277    5.3    300    4.9   98.1   58.1   95.7   32.2  103.5   29.7
dixie         705   25.5    704    8.3    704    5.9    699    5.5   99.9   32.7   99.9   23.1   99.2   21.7
learn         442   25.4    442   17.6    442   11.4    440   16.8  100.0   69.0   99.9   44.9   99.5   66.0
unzip         808   37.5    807   13.1    807   10.8    805    9.3   99.9   35.0   99.8   28.9   99.6   24.9
smail         738  176.5    637   96.1    635   75.4      -      -   86.3   54.5   86.1   42.7      -      -
simulator    1258   54.8   1087   22.5   1087   22.7   1151   24.2   86.4   41.1   86.4   41.3   91.5   44.2
flex         2025  220.2   2019  167.3   2019  153.8   2002  159.8   99.7   76.0   99.7   69.9   98.9   72.6
space        2234 1373.9   1936  573.5   1936  569.8   2086  467.3   86.7   41.7   86.7   41.5   93.4   34.0
bison        2394   94.9   2394   84.1   2338   41.0      -      -  100.0   88.6   97.7   43.2      -      -
larn         6626 3477.3   6602 1075.6   6592  902.4      -      -   99.6   30.9   99.5   26.0      -      -
mpeg_play    5708  325.5   3935  134.6   3935  139.5      -      -   68.9   41.3   68.9   42.9      -      -
espresso     6297 8332.1   6291 3776.5   6264 5367.1      -      -   99.9   45.3   99.5   64.4      -      -

Data are collected from all the slices of the program.
Data are collected from one slice.



5 Related Work 

Many data-flow analysis algorithms (e.g., [9,10]), including FIGS, use a two- 
phase interprocedural analysis framework: in the first phase, information is prop- 
agated from the called procedures to the calling procedures, and when a call 
statement is encountered, summaries about the called procedure are used to 
avoid propagating information into the called procedure; in the second phase, 
information is propagated from the calling procedures to the called procedures. 
Recently, Chatterjee et al. [4] use unknown initial values for parameters and 
global variables so that the summaries about a procedure can be computed 
for flow-sensitive alias analysis. Then, they use the two-phase interprocedu- 
ral analysis framework to compute flow- and context-sensitive alias information. 
Although their algorithm can improve the worst case complexity over Landi 
and Ryder’s algorithm [11] while computing alias information with the same 
precision, it is still too costly in practice. Furthermore, because no comparison 
between these two algorithms is reported, it is not known how much Chatterjee 
et al.’s algorithm outperforms Landi and Ryder’s algorithm. 

There have been a number of attempts to design algorithms to compute alias 
information with efficiency close to Steensgaard’s algorithm and with precision 
close to Andersen’s algorithm. Shapiro and Horwitz [14] propose a method that 
divides the program variables into k categories, and allows only variables 
belonging to the same category to be in an equivalence class. Thus, similar to 
FIGS, this method computes smaller equivalence classes, and provides a smaller 
points-to set for each pointer variable, than Steensgaard’s algorithm. FIGS 
differs from this method, however, in that it uses an independent set of 
equivalence classes for each procedure. Thus, FIGS can benefit from the fact 
that a procedure references only a small set of program variables. FIGS also 
differs from this method in that FIGS is context-sensitive (information is not 
propagated through invalid call/return sequences). Finally, FIGS differs from 
Shapiro and Horwitz’s algorithm in that FIGS can handle the fields of 
structures, whereas in their algorithm, assignments to a field of a structure 
are treated as assignments to the entire structure. Because of this last 
difference, it is difficult to compare our experimental results with theirs. 
However, from the experimental results reported in Reference [14], it appears 
that, on average, FIGS computes alias information that is closer to Andersen’s 
in precision than their algorithm. 

Harrold and Rothermel used a similar approach in [8]. 

6 Conclusions 

We presented a flow-insensitive, context-sensitive points-to analysis algorithm 
and conducted several empirical studies on more than 20 C programs to compare 
our algorithm with other alias-analysis algorithms. The empirical results 
show that, although Steensgaard’s algorithm is fast, the alias information com- 
puted by this algorithm is too imprecise to be used in whole-program analysis. 
The empirical results further show that using more precise alias information pro- 
vided by our algorithm, Andersen’s algorithm, and Landi and Ryder’s algorithm 
can effectively improve the precision and reduce the cost of whole-program anal- 
ysis. However, the empirical results also show that Andersen’s algorithm and 
Landi and Ryder’s algorithm could be too costly for analyzing large programs. 
In contrast, the empirical results show that our algorithm can compute alias in- 
formation that is almost as precise as that computed by Andersen’s algorithm, 
with running time that is within six times that of Steensgaard’s algorithm. Thus, 
our algorithm may be more effective than the other algorithms in supporting 
whole-program analysis. 

Our future work includes performing additional empirical studies, especially 
on large subject programs, to further compare our algorithm with other alias- 
analysis algorithms. We will also conduct more studies to see how the imprecision 
in the alias information computed by our algorithm can affect various whole- 
program analyses. 
Abstract. Dynamic analysis is the analysis of the properties of a running 
program. In this paper, we explore two new dynamic analyses based 
on program profiling: 

— Frequency Spectrum Analysis. We show how analyzing the frequen- 
cies of program entities in a single execution can help programmers 
to decompose a program, identify related computations, and find 
computations related to specific input and output characteristics of 
a program. 

— Coverage Concept Analysis. Concept analysis of test coverage data 
computes dynamic analogs to static control flow relationships such 
as domination, postdomination, and regions. Comparison of these 
dynamically computed relationships to their static counterparts can 
point to areas of code requiring more testing and can aid program- 
mers in understanding how a program and its test sets relate to one 
another. 



1 Introduction 

Dynamic analysis is the analysis of the properties of a running program. In 
contrast to static analysis, which examines a program’s text to derive properties 
that hold for all executions, dynamic analysis derives properties that hold for 
one or more executions by examination of the running program (usually through 
program instrumentation [14]). While dynamic analysis cannot prove that a 
program satisfies a particular property, it can detect violations of properties as 
well as provide useful information to programmers about the behavior of their 
programs, as this paper will show. 

The usefulness of dynamic analysis derives from two of its essential charac- 
teristics: 

— Precision of information: dynamic analysis typically involves instrumenting 
a program to examine or record certain aspects of its run-time state. This 
instrumentation can be tuned to collect precisely the information needed 
to address a particular problem. For example, to analyze the shape of data 
structures created by a program (lists, trees, dags, etc.), an instrumentation 
tool can be created to record the linkages among heap-allocated storage cells. 
— Dependence on program inputs: the very thing that makes dynamic analysis 
incomplete also provides a powerful mechanism for relating program inputs 
and outputs to program behavior [15]. With dynamic analysis it is 
straightforward to relate changes in program inputs to changes in internal 
program behavior and program outputs, since all are directly observable and 
linked by the program execution. Viewed in this light, dynamic and static 
analysis might be better termed “input-centric” and “program-centric” 
analysis, respectively. 

Dynamic and static analyses are complementary techniques in a number of 
dimensions: 

— Completeness. In general, dynamic analyses generate “dynamic program 
invariants”, properties which are true for the observed set of executions [11]. 
Static analysis may help determine whether or not these dynamic “invariants” 
truly are invariants over all program executions. In the cases where the 
dynamic and static analyses disagree, there are two possibilities: (1) the 
dynamic analysis is in error because it did not cover a sufficient number of 
executions; (2) the static analysis is in error because it analyzed infeasible 
paths (paths that can never execute). Since dynamic analysis examines actual 
program executions, it does not suffer from the problem of infeasible paths 
that can plague static analyses. On the other hand, dynamic analysis, by 
definition, considers fewer execution paths than static analysis. 

— Scope. Because dynamic analysis examines one very long program path, it 
has the potential to discover semantic dependencies between program enti- 
ties widely separated in the path (and in time). Static analysis typically is 
restricted in the scope of a program it can analyze effectively and efficiently, 
and may have trouble discovering such “dependencies at a distance”. 

— Precision. Dynamic analysis has the benefit of examining the concrete do- 
main of program execution. Static analysis must abstract over this domain in 
order to ensure termination of the analysis, thus losing information from the 
start. Abstraction can be a useful technique for reducing the run-time over- 
head of dynamic analysis and reducing the amount of information recorded, 
but is not required for termination. 

In this paper, we illustrate and discuss some of these concepts of dynamic 
analysis using program profiles [3]. A program profile counts the number of 
times program entities occur in a program execution. For example, a statement 
level profile counts how many times each statement executes. Profiles can be 
recorded at many different levels, from that of objects, methods and procedures, 
down to paths, branches and even individual machine instructions. Profiling 
tools are commonplace today, with most compilers and operating systems 
providing accompanying profiling toolsets. 
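
To make the notion of a statement-level profile concrete, the following is a 
minimal sketch in C. It is not the instrumentation used by any particular 
profiling tool: the counter array profile, the HIT macro, and the subject 
function sum_to are hypothetical names introduced here for illustration only. 
Each probe simply increments the counter of the statement it precedes. 

```c
#include <assert.h>

/* Hypothetical statement identifiers for the instrumented function below. */
enum { STMT_LOOP_TEST, STMT_LOOP_BODY, STMT_RETURN, STMT_COUNT };

/* The profile: one execution counter per program entity (here, statements). */
static unsigned long profile[STMT_COUNT];

/* Instrumentation probe: record one execution of statement `id`. */
#define HIT(id) (profile[(id)]++)

/* Clear all counters before a new run. */
static void profile_reset(void) {
    for (int i = 0; i < STMT_COUNT; i++) profile[i] = 0;
}

/* Example subject: sums 1..n, with a probe in front of each statement. */
static int sum_to(int n) {
    int total = 0;
    assert(n >= 0);
    for (int i = 1; ; i++) {
        HIT(STMT_LOOP_TEST);      /* the loop test runs n + 1 times */
        if (i > n) break;
        HIT(STMT_LOOP_BODY);      /* the loop body runs n times */
        total += i;
    }
    HIT(STMT_RETURN);             /* the return runs once */
    return total;
}
```

Calling profile_reset() and then sum_to(10) leaves the loop test with a count 
of 11, the loop body with 10, and the return with 1. Even in this tiny case the 
frequencies separate the repetitive work from the code that runs once. 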

We propose two new dynamic analyses based on program profiling: 

— Frequency Spectrum Analysis (FSA). The idea behind FSA is that analyzing 
the frequencies of program entities in a single execution can help programmers 
to decompose a program, identify related computations, and find 
computations related to specific input and output characteristics of the 
program. We demonstrate FSA on a small obfuscated C program that prints the 
poem “The Twelve Days of Christmas”. For this case study, we used path 
profiling [1] technology to monitor the execution behavior of the program. 
Based on our analysis, we created an “unobfuscated” version of the program 
that retains the original program’s profile signature and clearly explains the 
operation of the original program. 

— Coverage Concept Analysis (CCA). We show how concept analysis applied to 
coverage profiles naturally computes dynamic analogs to static control flow 
relationships such as domination and regions, identifying “dynamic control 
flow invariants” across a set of executions. Comparison of the dynamically 
invariant control flow relationships to their static counterparts can point to 
areas of code requiring more testing and can aid programmers in understand- 
ing how their code and test sets relate to one another. 

This paper is organized as follows. Section 2 presents the basic ideas behind 
frequency spectrum analysis and our case study of the obfuscated C program. 
Section 3 reviews concept analysis and shows the different ways in which it can 
help us to understand the relationships between tests and coverage information. 
Section 4 discusses related work. Section 5 concludes the paper. 

2 Frequency Spectrum Analysis 

This section presents the ideas behind frequency spectrum analysis (FSA) and 
then describes how this analysis was used to help understand the internal be- 
havior of an obfuscated C program. 

2.1 The Meaning of Frequencies 

The traditional use of program profiles in performance tuning is to separate the 
frequently executed parts of a program from the less frequently executed parts. 
By delving 
a bit deeper into the information in program profiles (that is, the frequencies of 
the program entities, as recorded in a profile), FSA can help a programmer in 
three basic tasks: 

— partitioning the program by levels of abstraction; 

— finding related computations; 

— finding computations related to specific attributes of a program’s input or 
output. 

In the next section, we will present our analysis of an obfuscated C program 
based on several general observations made in this section. Table 1 shows 
the path profile of the obfuscated C program’s execution (Figure 1). Twelve 
paths executed, and each path’s static identifier (composed of the procedure 
name containing the path and the path’s integer identifier in that procedure) 
and execution frequency are shown. The paths are sorted in ascending order of 
frequency. We will use this path profile to motivate FSA, without reference to 
the program’s output or its code. In the next section, we will analyze how the 
paths and frequencies are related to the program’s output and structure. 

Path ID    Frequency    Path ID    Frequency
main:0             1    main:2           114
main:19            1    main:3           114
main:22            1    main:1          2358
main:23           10    main:7          2358
main:9            11    main:4         24931
main:13           55    main:5         39652

Table 1. A path profile of the (readable) obfuscated C program’s execution. 

FSA is based on three simple observations about how frequencies relate to 
program behavior: 

— Low Versus High Frequencies. The relative execution frequencies of program 
entities can provide clues as to their place in the hierarchy of program ab- 
stractions. For example, the interface procedures to a sorting module gen- 
erally will be called many fewer times than the private procedures in the 
module that invoke one another to perform the sort operation. In object- 
oriented programs, methods implementing a high-level architectural pattern 
probably will have lower execution frequency than methods implementing 
the guts of an algorithm. 

In Table 1, we immediately see that the paths main:4 and main:5 have much 
higher frequencies than the other ten paths. This indicates that these paths 
are involved in some highly repetitive computation. 

— Related Frequencies and Frequency Clusters. The fact that a procedure foo 
is called 1033 times may not be particularly noteworthy. However, the fact 
that procedures foo and bar each are called 1033 times usually is more 
than mere coincidence. This is the basic idea behind related frequencies or 
“frequency clusters”. 

The reason for such frequency clustering may be that procedure foo always 
calls procedure bar, or that there is another procedure foobar that calls both 
foo and bar. There can be many explanations for a frequency cluster. Re- 
gardless of the underlying mechanism that created the cluster, the cluster by 
itself is an interesting hint to the programmer about dynamic relationships 
between program entities that may not be apparent in the static program 
structure. Frequency clusters partition the program many ways, slicing across 
traditional abstraction boundaries, as entities widely separated in program 
text may be related to one another through common frequency. 

Two clusters are immediately apparent in the path profile of Table 1: paths 
main:2 and main:3 with frequency 114 and paths main:1 and main:7 with 
frequency 2358. 
#include <stdio.h>
main(t,_,a)char*a;{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/")
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);}

Fig. 1. An obfuscated C program to print the poem “The Twelve Days of Christmas”. 
The partial output of the program is shown in Figure 5. 



— Specific Frequencies. Knowledge about the characteristics of a program’s 
input or output can greatly aid in FSA. For example, if the output of a program 
is an enumeration of records, there is probably a program entity whose 
frequency is the size of this enumeration. Frequencies related to the input or 
output domain of a program can help a programmer identify those parts 
of a program responsible for input or output. This idea can be extended in 
several obvious directions. For example, one can look for frequencies that 
might indicate an O(N^2) algorithm, as suggested by [19]. 
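
The search for related frequencies described above can be mechanized. The 
sketch below is one simple way to do it and is not taken from the paper: it 
lists every frequency shared by two or more paths of Table 1. The type 
path_count and the functions cluster_size and shared_frequencies are 
hypothetical names; the data are the twelve entries of Table 1. 

```c
#include <assert.h>
#include <stddef.h>

/* One profile entry: a path identifier and its execution frequency. */
struct path_count { const char *id; unsigned long freq; };

/* The path profile of Table 1. */
static const struct path_count table1[] = {
    {"main:0", 1},    {"main:19", 1},   {"main:22", 1},
    {"main:23", 10},  {"main:9", 11},   {"main:13", 55},
    {"main:2", 114},  {"main:3", 114},  {"main:1", 2358},
    {"main:7", 2358}, {"main:4", 24931}, {"main:5", 39652},
};
#define TABLE1_LEN (sizeof table1 / sizeof table1[0])

/* Number of entries whose frequency equals freq. */
static size_t cluster_size(const struct path_count *p, size_t n,
                           unsigned long freq) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (p[i].freq == freq) count++;
    return count;
}

/* Stores each frequency shared by at least two entries, once, in order of
   first appearance. Writes at most cap values; returns how many exist. */
static size_t shared_frequencies(const struct path_count *p, size_t n,
                                 unsigned long *out, size_t cap) {
    size_t found = 0;
    for (size_t i = 0; i < n; i++) {
        int seen_before = 0;
        for (size_t j = 0; j < i; j++)
            if (p[j].freq == p[i].freq) { seen_before = 1; break; }
        if (seen_before || cluster_size(p, n, p[i].freq) < 2) continue;
        if (found < cap) out[found] = p[i].freq;
        found++;
    }
    return found;
}
```

On the data of Table 1 this reports the frequencies 1, 114 and 2358. The three 
paths that execute exactly once also share a frequency, but a count of 1 says 
little on its own; the clusters at 114 and 2358 are the informative ones. 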

As suggested above, profiles contain a wealth of information that is rarely 
exploited by programmers. Jon Bentley, in his series of columns and books on 
writing efficient programs, discusses how execution counts “tell interesting 
tales” and can help programmers to debug misbehaving programs as well as to 
tune the performance of well behaved programs [5,6]. In the next section, we 
explore this idea in some detail through a case study. 

2.2 Case Study: Understanding an Obfuscated C Program 

Figure 1 presents an obfuscated C program that often makes the rounds during 
the holiday season (the author has received it at least twice). The program 
takes no input* and produces the poem “The Twelve Days of Christmas”, an 
excerpt of which is presented in Figure 5 in the Appendix. 

In this section, we will show how we used FSA to help determine how the 
program accomplishes the printing of the poem and to create a new 
“unobfuscated” program that explains how the original program works. In 
restructuring the program, we maintained as much of the original program’s 
computational signature as possible. Whenever possible, we rewrote the program 
in the spirit of the original program, rather than substituting a radically 
different piece of code in place of one we didn’t happen to like. 

* It should be noted that in this very special circumstance, a dynamic analysis 
is a static analysis. Nonetheless, the information computed by the dynamic 
analysis (profiles) is unavailable from conventional static analyses. 

#include <stdio.h>
main(t,_,a) char *a;
{
     if ((!0) < t) {
[1]      if (t < 3) main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a));
[2]      if (t < _) main(t+1,_,a);
[3]      main(-94,-27+t,a);
[4]      if (t==2 && _ < 13) main(2,_+1,"%s %d %d\n");
     } else if (t < 0) {
[5]      if (t < -72) main(_,t,LARGE_STRING);
         else if (t < -50) {
[6]          if (_ == *a) putchar(31[a]);
[7]          else main(-65,_,a+1);
[8]      } else main((*a=='/')+t,_,a+1);
[9]  } else if (0 < t) main(2,2,"%s");
[10] else if (*a!='/') main(0,main(-61,*a,SMALL_STRING),a+1);
}

Fig. 2. A (more) readable version of the obfuscated C program, after reformatting, 
performing local syntactic substitutions to turn expressions into statements and 
eliminating dead code. There are 10 lines containing calls, each uniquely 
numbered in brackets. 



Making the Program Readable To understand a program, it first is helpful 
to be able to read it. The given program is barely readable, even for those very 
familiar with the C language. Our first task was to reformat the code, using 
indentation and explicit parenthesization to make it more readable, as well as 
rewriting it without the use of conditional or list expressions. Figure 2 shows the 
result of these local syntactic transformations. 

The readable obfuscated program consists of one function main with three 
arguments (t, _ and a) and calls itself repeatedly. The second argument is an 
underscore, which is a legal variable name in C. The function main truly is a 
function, as it does not update any variables. It achieves its goal based solely 
on the values passed to it. The initial invocation of the program will cause the 
value of parameter t to be 1 (because in Unix, the first argument to main is 
the count of the number of arguments on the command line including the name 
of the program itself). The program contains two strings (shown in the original 
program in Figure 1, but elided here to LARGE_STRING and SMALL_STRING), which 
appear to encode the text of the poem. 
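Path ID    Frequency    Condition                         Call Lines
======================================================================
main:0             1    t == 1                            [9]
======================================================================
main:19            1    t==2 && t >= _                    [1,3,4]
main:22            1    t==2 && t < _ && _ >= 13          [1,2,3]
main:23           10    t==2 && t < _ && _ < 13           [1,2,3,4]
======================================================================
main:9            11    t >= 3 && t >= _                  [3]
main:13           55    t >= 3 && t < _                   [2,3]
======================================================================
main:2           114    t == 0 && *a == '/'               no call lines
main:3           114    t < -72                           [5]
======================================================================
main:1          2358    t == 0 && *a != '/'               [10]
main:7          2358    t > -72 && t < -50 && _ == *a     [6]
======================================================================
main:4         24931    t < 0 && t >= -50                 [8]
main:5         39652    t > -72 && t < -50 && _ != *a     [7]

Table 2. Summary of the twelve executed paths in the readable obfuscated C program 
of Figure 2. 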



The Frequency Spectrum Analysis Before taking on a reverse engineering 
task, it is important to have some model in mind to help guide the process. 
The “Twelve Days of Christmas” is all about counting gifts, so we approach 
the poem and the program by identifying various quantities that arise from the 
poem’s natural structure: 

— 12 verses, one for each of the 12 days of Christmas. 

— 26 unique strings: there are many repeated strings in the poem. There are 
three strings for the common structure (“On the”, “day of Christmas...”, 
“and a partridge ...”), 12 strings for the ordinals, and 11 strings for the 
second through twelfth gifts, giving a total of 26 unique strings. 

— 66 occurrences of presents other than a “partridge in a pear tree” (which 
occurs in every verse). 

— 114 strings printed: 12 occurrences of the three common strings (36), 12 
ordinals, and 66 non-partridge gifts (36 + 12 + 66 = 114); 

— 2358 characters printed as output, as counted by the Unix word count utility 
wc. 

We have seen some of these frequencies before in Table 1. Recall that the 
goal of FSA is to use the frequencies obtained from a program profile to aid in 
understanding the program. The idea is that these execution counts will help 
us identify which parts of the program are responsible for which parts of the 
poem. For example, a program element with an execution count of 11 or 12 
may indicate an entity involved in the control of the number of verses, while 
an element with an execution count of 2358 is most likely involved in printing 
characters. 
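
The counts derived from the poem’s structure can be checked with a few lines of 
C. This is only a worked restatement of the arithmetic in the list above; the 
function names and the constants VERSES and COMMON_STRINGS are hypothetical and 
do not appear in the original program. 

```c
#include <assert.h>

enum { VERSES = 12, COMMON_STRINGS = 3 };

/* Verse d lists d - 1 gifts besides the partridge, so the total over all
   verses is 0 + 1 + ... + (verses - 1). */
static int non_partridge_gifts(int verses) {
    int total = 0;
    assert(verses >= 0);
    for (int day = 1; day <= verses; day++) total += day - 1;
    return total;
}

/* Distinct strings: the common phrases, one ordinal per verse, and one gift
   for each of the second through last days. */
static int unique_strings(int verses) {
    return COMMON_STRINGS + verses + (verses - 1);
}

/* Strings printed: every verse prints the common phrases and one ordinal,
   plus its non-partridge gifts. */
static int strings_printed(int verses) {
    return COMMON_STRINGS * verses + verses + non_partridge_gifts(verses);
}
```

With VERSES equal to 12 these functions give 66 non-partridge gifts, 26 unique 
strings, and 114 strings printed, matching the quantities listed above. 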
We used the PP path profiling tool of Ammons, Ball and Larus [4,1] to 
capture intraprocedural path execution counts of the readable program. The 
program takes no input, so there is only one path profile to consider. Table 2 
repeats the twelve executed paths in the path profile of the readable program 
from Table 1, with some additional information. For this program, each path is 
uniquely identified by the conditions on the parameters t, _ and a and by the 
lines in the path that contain procedure calls (referred to here as “call lines”). 
There are ten lines containing procedure calls in the code in Figure 2, labelled 
in brackets. The path condition and the procedure call lines in each path are 
summarized in Table 2. 

The first thing that is apparent from Table 2 is that there is a strong corre- 
lation between a path’s frequency and the call lines that it covers. Paths with 
frequencies less than 100 cover subsets of call lines in the set {1, 2, 3, 4, 9}, while 
each path with frequency greater than 100 covers a different call line not in this 
set. A closer examination of the code and the paths shows that the paths cluster 
into six main groups (separated by the double lines in the table), as detailed 
below: 

— Path main:0 (executed once) initializes the recursion. 

— Paths main:19, main:22, and main:23 control the printing of the 12 verses. In 
particular, path main:19 represents the first verse, path main:23 the middle 
10 verses, and path main:22 the last verse. The sum of these paths’ 
frequencies is 12, the number of verses in the poem. Each of the paths covers a 
different set of recursive calls to main (call lines 1-4). These paths helped 
us identify that certain calls were responsible for the first line of each verse 
(call line 1), starting the inner loop iteration to print the list of gifts (call 
line 2), printing a single gift (call line 3), as well as iterating the outer loop 
(call line 4). 

— Paths main:9 and main: 13 control the printing of the non-partridge-gifts 
within a verse. Note that the frequencies of the two paths sum to 66, as 
expected from our analysis of the poem. These paths make up the “inner 
loop” of the program. 

— Paths main:2 and main:3 are responsible for printing out a string. Each path 
has frequency 114, the exact number of strings predicted by analyzing the 
poem’s structure. The path main:3 represents the initialization (passing the 
large string in as parameter a) and the path main:2 represents the termina- 
tion of the printing of the string (when the ’/’ separator is found). 

— Paths main:1 and main:7 print out the characters in a string. Each path 
executes 2358 times. Why are there two paths with frequency 2358? We will 
soon see. 

— What about the anomalous paths main:4 and main:5 with the large 
frequencies of 24931 and 39652? Examination of the code reveals that path 
main:4 is responsible for skipping over t sub-strings in LARGE_STRING to get 
to the (t+1)-th sub-string. Each sub-string is terminated with the ’/’ 
character. Every time the (t+1)-th sub-string is to be printed, a linear scan 
through the large string is done to get to that sub-string, which accounts for 
path main:4’s high frequency. 

Path main:5 scans SMALL_STRING until it finds the character in it that matches 
the current character (the value of the argument _) to be printed, at which 
point path main:7 executes. The character 31 positions later in the small 
string (31[a], which in C is equivalent to a[31]) is the translation of the 
character. This explains why there are two paths with frequency 2358. Path 
main:1 is the initiation of the search of the small string to find the character 
translation and path main:7 performs the translation and printing of the 
character. Path main:5’s high frequency is due to the fact that the small 
string is scanned each time for every character to be printed. 

Note on intraprocedural paths: they do not follow control flow from a call site 
to the entry of the called procedure. They stay in the same procedure 
(effectively treating the procedure call as if it had no effect on the control 
flow). 
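
The character translation described above can be sketched in isolation. The 
code below is an illustration under stated assumptions, not the paper’s 
program: translate is a transcription of the small string of Figure 1 (treat 
its exact characters as an assumption), and translate_char, decode and 
scan_steps are hypothetical names. The expression 31[a] is deliberately kept 
to show that it denotes the same element as a[31]. 

```c
#include <assert.h>
#include <stddef.h>

/* Transcription of SMALL_STRING: 31 cipher characters followed by their
   31 translations. */
static const char translate[] =
    "!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry";

/* Counts how many cipher positions were examined, to make the cost of the
   repeated linear scan visible. */
static unsigned long scan_steps;

/* Linear scan of the first 31 characters for c. Returns the character 31
   positions later, or -1 if c is not a cipher character. */
static int translate_char(char c) {
    for (const char *a = translate; a < translate + 31; a++) {
        scan_steps++;
        if (*a == c) return (unsigned char)31[a];   /* same as a[31] */
    }
    return -1;
}

/* Decodes one '/'-terminated sub-string into out (capacity cap, including
   the terminating NUL). Returns the number of characters written. */
static size_t decode(const char *encoded, char *out, size_t cap) {
    size_t n = 0;
    assert(cap > 0);
    while (*encoded != '\0' && *encoded != '/' && n + 1 < cap) {
        int ch = translate_char(*encoded++);
        if (ch < 0) break;
        out[n++] = (char)ch;
    }
    out[n] = '\0';
    return n;
}
```

Decoding the first sub-string of the large string, "@n'+,#'/", yields the 
seven characters "On the " at the cost of 132 scan steps. This ratio of scan 
steps to printed characters is what drives the high frequency of the scanning 
path relative to the printing paths. 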



The Restructured Program Using the knowledge gained from FSA and manual 
examination of the program, we restructured the program to produce the 
program shown in Figure 3. We strove to keep the recursive structure of the 
program intact, but used different functions to represent the different tasks of 
the original program, as captured by the clustering of the paths. We did not 
change the values of the two relevant text strings (the list of sub-strings of the 
poem, LARGE_STRING, and the translation mapping, SMALL_STRING). The origi- 
nal program used the value 2 to represent the first day of Christmas. We shifted 
this down to 1 to match the poem. 

There are seven functions in the new program, corresponding closely to the 
clusters of paths identified in the old program: 

— main (path main:0); 

— outer_loop (paths main:19, main:22 and main:23); 

— inner_loop (paths main:9 and main:13); 

— print_string (paths main:2 and main:3); 

— output_chars (paths main:1 and main:7) and translate_and_put_char 
(path main:5); 

— skip_n_strings (path main:4). 

The new program has the exact same output as the old, and all of the per- 
formance disadvantages as well. To show that we have (in some sense) captured 
the essence of the original program, we path profiled the new program. The path 
profile of the new program is shown in Table 3, with paths sorted in ascending 
order of frequency; it is very similar to the original profile (Table 2) with some 
minor differences due to the restructuring. 



#include <stdio.h>

static char *strings = LARGE_STRING;    /* the original set of strings */
static char *translate = SMALL_STRING;  /* the translation mapping */

#define FIRST_DAY 1
#define LAST_DAY 12

/* the original "indices" of the various strings */
enum { ON_THE = 0, FIRST = -1, TWELFTH = -12, DAY_OF_CHRISTMAS = -13,
       TWELVE_DRUMMERS_DRUMMING = -14, PARTRIDGE_IN_A_PEAR_TREE = -25
};

char *skip_n_strings(int n, char *s) {  /* skip -n strings (separator is /), */
    if (n == 0) return s;               /* where n is a negative value */
    if (*s == '/') return skip_n_strings(n+1, s+1);
    else return skip_n_strings(n, s+1);
}

/* find the character in the translation buffer
   matching c and output the translation */
void translate_and_put_char(char c, char *trans) {
    if (c == *trans) putchar(trans[31]);
    else translate_and_put_char(c, trans+1);
}

void output_chars(char *s) {
    if (*s == '/') return;
    translate_and_put_char(*s, translate);
    output_chars(s+1);
}

/* skip to the "n-th" string and print it */
void print_string(int n) { output_chars(skip_n_strings(n, strings)); }

/* print the list of gifts */
void inner_loop(int count_day, int current_day) {
    if (count_day < current_day) inner_loop(count_day+1, current_day);
    print_string(PARTRIDGE_IN_A_PEAR_TREE+(count_day-1));
}

void outer_loop(int current_day) {
    print_string(ON_THE);               /* "On the " */
    print_string(-current_day);         /* ordinal, ranges from -1 to -12 */
    print_string(DAY_OF_CHRISTMAS);     /* "day of Christmas ..." */
    inner_loop(FIRST_DAY, current_day); /* print the list of gifts */
    if (current_day < LAST_DAY)
        outer_loop(current_day+1);
}

int main(void) { outer_loop(FIRST_DAY); return 0; }

Fig. 3. The restructured “The Twelve Days of Christmas” program. 

Path ID           Frequency    Path ID                     Frequency
main:0                    1    skip_n_strings:0                  114
outer_loop:0              1    skip_n_strings:2                 1898
outer_loop:1             11    output_chars:0                   2358
inner_loop:0             12    translate_and_put_char:2         2358
inner_loop:1             66    skip_n_strings:1                23033
output_chars:1          114    translate_and_put_char:0        39652
print_string:0          114

Table 3. The path profile of the restructured program. 

Summary A well known folk theorem in computer science is that any program 
can be transformed into a semantically equivalent program consisting of a single 
recursive function. This is what makes the obfuscated “12 Days of Christmas” 
program most difficult to understand. The first parameter to the function main 
takes on the role of the program counter, and parameters are overloaded to have 
different interpretations depending on the context in which they are used. 

We used FSA to help separate out the set of functions that this single function 
implements. This small case study illustrates the essential features of FSA: 

— The use of low versus high frequencies to partition the program by levels of 
abstraction (for example, the printing of verses as compared to scanning of 
strings); 

— The use of frequency clusters to identify related computations in the program 
(for example, the paths comprising the outer and inner loops); 

— The use of specific frequencies to find computations related to the program’s 
observed behavior (for example, the paths responsible for printing a sub- 
string or a character). 

Our analysis clearly leaves many questions unanswered. Although complex, 
the obfuscated C program was quite small. How will FSA scale to larger pro- 
grams with accompanying larger profiles? There are a number of issues here. 
With the obfuscated C program, there was a rather direct relationship between 
attributes of the program’s output and the program’s behavior. With larger pro- 
grams containing complex intermediate computations, we cannot hope to find 
such direct relationships. The size of the profile is also an issue, as there will 
generally be a lot of “noisy” data surrounding the data that one is interested
in. We feel that the three basic observations of FSA (low vs. high frequency, 
frequency clusters, and special frequencies) will continue to be useful for larger 
programs, but only experience will show how. 

Another shortcoming of our case study was that the obfuscated C program 
had no inputs. The appearance of the same frequency correlations across dif- 
ferent executions (even if absolute frequency values are different) would provide 
stronger evidence of semantic relationships between parts of a program. In the 
next section, we discuss an approach to help analyze multiple execution profiles 
and compare the relationships in program executions to their static counterparts 
in program source text. 








3 Coverage Concept Analysis 

The previous section demonstrated how analysis of the frequency spectrum of a 
single program execution can help in understanding and decomposing a program. 
What can be done if there are many executions to be examined? This section 
considers this question for a restricted but very commonly used type of profile, 
the coverage profile, which records, for each test run, the entities that executed
(but not their frequencies). 

The main result of this section is to show that concept analysis applied to
coverage profiles naturally computes dynamic analogs to static control flow
relationships such as domination and regions, identifying “dynamic control flow
invariants” across a set of executions. Additionally, the comparison of the
dynamically invariant control flow relationships to their static counterparts
can point to areas of code requiring more testing and can aid programmers in
understanding how their code and test sets relate to one another.



3.1 Concept Analysis and Test Coverage 

Concept analysis is a technique for identifying groups of objects that have com- 
mon attributes [10]. The input to concept analysis is a binary relation between 
objects and attributes. This relation can be represented as a boolean-valued ta- 
ble in which rows represent objects and columns represent attributes. An entry 
of the table is true if an object has an attribute and false otherwise. 

For our purposes, the objects (rows) are tests and the attributes (columns)
are the program entities that a test may cover, such as the procedures,
statements, branches or paths of the program. Figure 4(a) shows an example of a
test coverage table that can be input to concept analysis. The table shows
procedure-level coverage of five tests (t1 through t5) of an implementation of a
red-black tree data structure (a form of balanced binary tree). The procedure
names have been shortened to make the table more compact.

In the testing domain, the pair (T, E), where T is a set of tests and E a
set of program entities, is a concept if every test in T covers all the entities in
E, and no test outside of T covers all the entities in E. Equivalently, (T, E)
is a concept if every entity in E is covered by every test in T and there is no
entity outside of E covered by every test in T. Stated yet another way, concepts
determine maximal sets of tests covering identical entities (and maximal sets of
entities covered by identical tests). Concepts can be computed by a variety of
algorithms [12,18]. In the worst case, for a table of n rows by n columns,
there may be $2^n$ concepts, so the worst-case running time of any batch algorithm
that computes all concepts is exponential in n. In practice, concept lattices have
$O(n^2)$ concepts and sometimes even $O(n)$ concepts [18].

The table in Figure 4(a) gives rise to six concepts, shown in Figure 4(b). The 
concept c4 has the tests { t2,t3,t4,t5 }, which have the procedures { add, rem, 
IRotate } in common. Furthermore, this set of procedures has exactly the tests 
{ t2,t3,t4,t5 } in common. The pair ({ t1, t2 }, { add, rem, DelFix }) is not a
        Procedures
Test    add   IRotate   rem   Min   Succ   DelFix
t1       X               X                   X
t2       X       X       X                   X
t3       X       X       X     X     X
t4       X       X       X     X     X       X
t5       X       X       X     X     X       X

(a) 



Concept   Tests                  Procedures
c1        t4, t5                 add, IRotate, rem, Min, Succ, DelFix
c2        t3, t4, t5             add, IRotate, rem, Min, Succ
c3        t2, t4, t5             add, IRotate, rem, DelFix
c4        t2, t3, t4, t5         add, IRotate, rem
c5        t1, t2, t4, t5         add, rem, DelFix
c6        t1, t2, t3, t4, t5     add, rem

(b) 




(c) (d) 



Fig. 4. (a) Partial procedure coverage from five tests of a red-black tree
implementation; (b) The six concepts this coverage information induces; (c) Concept
lattice with full labelling of tests and procedures; (d) Concept lattice with minimal labelling.
concept because the set { t1, t2 } is not the maximal set of tests with common
entities { add, rem, DelFix } (as concept c5 illustrates).
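To make the definition concrete, the following Python sketch enumerates the concepts of the coverage table in Figure 4(a) by brute force. It is a minimal illustration that assumes a table small enough to try every subset of tests; it is not one of the algorithms of [12,18], and the function names are introduced here for illustration only.

```python
from itertools import combinations

# Procedure-level coverage transcribed from Figure 4(a).
COVERAGE = {
    "t1": {"add", "rem", "DelFix"},
    "t2": {"add", "IRotate", "rem", "DelFix"},
    "t3": {"add", "IRotate", "rem", "Min", "Succ"},
    "t4": {"add", "IRotate", "rem", "Min", "Succ", "DelFix"},
    "t5": {"add", "IRotate", "rem", "Min", "Succ", "DelFix"},
}


def common_entities(tests, coverage):
    """Entities covered by every test in `tests` (all entities if `tests` is empty)."""
    result = set().union(*coverage.values())
    for t in tests:
        result &= coverage[t]
    return frozenset(result)


def covering_tests(entities, coverage):
    """Tests that cover every entity in `entities`."""
    return frozenset(t for t, covered in coverage.items() if entities <= covered)


def concepts(coverage):
    """Enumerate all (tests, entities) concepts; exponential in the number of tests."""
    found = set()
    tests = sorted(coverage)
    for k in range(len(tests) + 1):
        for subset in combinations(tests, k):
            entities = common_entities(subset, coverage)
            # Closing the test set guarantees that both components are maximal.
            found.add((covering_tests(entities, coverage), entities))
    return found


CONCEPTS = concepts(COVERAGE)
```

For this table the enumeration yields the six concepts of Figure 4(b). The pair ({ t1, t2 }, { add, rem, DelFix }) does not appear, because closing its test set adds t4 and t5.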

Concepts can be ordered by set inclusion on tests or entities. The set of all
concepts forms a complete partial order ($\sqsubseteq$), given by:

$(T_1, E_1) \sqsubseteq (T_2, E_2) \iff T_1 \subseteq T_2 \iff E_2 \subseteq E_1$

This partial order is also referred to as the concept lattice. Figure 4(c) shows
the concept lattice for the six concepts c1 through c6, with one node for each
concept. If $c \sqsubset d$ (and there is no concept $c'$ such that
$c \sqsubset c' \sqsubset d$) then there is an arrow $c \to d$ in the lattice.
Each concept is labelled with its associated set of tests and set of entities.

There are a number of important properties of the concept lattice: 

— If a test t is in a concept c then it is in any concept greater than c (higher 
in the lattice). Furthermore, if an entity e is in a concept c, then it is in 
any lesser concept (lower in the lattice). In the running example, test t3 is 
in concept c2 and so is also in concepts c4 and c6. Procedure IRotate is in 
concept c4, so it is also in concepts cl, c2, and c3. 

— For every test t, there is a unique least concept in which it appears, denoted
by lcon(t). Similarly, for every entity e, there is a unique greatest concept in
which it appears, denoted by gcon(e). Concept c2 is the least concept
containing test t3. Similarly, c4 is the greatest concept containing the procedure
IRotate.

Figure 4(d) shows how the concept lattice can be labelled so that each test
and entity appears exactly once. A concept c is labelled with a test t if and
only if c = lcon(t). Likewise, a concept c is labelled with an entity e if and only
if c = gcon(e). From now on, the term concept lattice is used to refer to the
concept lattice labelled in this fashion. All the information in the input table
can be recovered from this concept lattice.
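The labelling can be sketched directly from the coverage table. The Python fragment below is an illustration under the assumption that lcon(t) and gcon(e) are computed by closing a single test or a single entity; the helper names are hypothetical. It also checks the recovery claim: a test t covers an entity e exactly when lcon(t) lies at or below gcon(e).

```python
# Procedure-level coverage transcribed from Figure 4(a).
COVERAGE = {
    "t1": {"add", "rem", "DelFix"},
    "t2": {"add", "IRotate", "rem", "DelFix"},
    "t3": {"add", "IRotate", "rem", "Min", "Succ"},
    "t4": {"add", "IRotate", "rem", "Min", "Succ", "DelFix"},
    "t5": {"add", "IRotate", "rem", "Min", "Succ", "DelFix"},
}
ENTITIES = set().union(*COVERAGE.values())


def lcon(test, coverage):
    """Least concept containing `test`: its entities are exactly what the test covers."""
    entities = frozenset(coverage[test])
    tests = frozenset(t for t, cov in coverage.items() if entities <= cov)
    return tests, entities


def gcon(entity, coverage):
    """Greatest concept containing `entity`: its tests are exactly those covering it."""
    tests = frozenset(t for t, cov in coverage.items() if entity in cov)
    entities = set().union(*coverage.values())
    for t in tests:
        entities &= coverage[t]
    return tests, frozenset(entities)


def below_or_equal(c, d):
    """Concept order: c is at or below d when c's test set is contained in d's."""
    return c[0] <= d[0]


def recover_row(test, coverage):
    """Rebuild one table row using only the lattice order on labelled concepts."""
    everything = set().union(*coverage.values())
    return {e for e in everything
            if below_or_equal(lcon(test, coverage), gcon(e, coverage))}
```

With the data of Figure 4(a), `lcon("t3", COVERAGE)` is concept c2 and `gcon("IRotate", COVERAGE)` is concept c4, matching the running example.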



3.2 Concepts and Control Flow Invariance 

This section shows how concept analysis of the test-vs-entities table provides
dynamic analogs to static control flow relationships such as domination,
postdomination and regions. Concept analysis of tests-vs-entities identifies “dynamic
control flow invariants” between entities over a set of tests. These “invariants”
are dynamic because they are not guaranteed to hold for all executions, but do hold
for the set of observed executions (tests). The comparison of the dynamic and
static control flow invariants in a program can be used to help develop new tests.



Domination, Postdomination, and Control Flow Implication. Domination
and postdomination are binary relations over the control flow entities of a
program that identify when the execution of one entity implies the execution
of another. Consider control flow entities e and f. Entity e is said to dominate
entity f if every path from program entry to f includes e. Entity f is said to
postdominate entity e if every path from e to program exit includes entity f.

If entity f dominates entity e then any test that covers e must also cover f.
If f postdominates e then it is also the case that any test that covers e must also
cover f. Thus, execution of entity e (statically) implies the execution of entity
f if f dominates e or f postdominates e.
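As a sketch of how static implication could be computed, the Python fragment below uses the textbook iterative dominator computation and obtains postdominators by running the same computation on the reversed graph. The control flow graph is a hypothetical diamond-shaped example, not taken from the red-black tree program, and all names are assumptions made for illustration.

```python
def dominators(cfg, entry):
    """Iteratively compute, for each node, the set of nodes that dominate it."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def reverse(cfg):
    """Reverse every edge, so that dominators of the result are postdominators."""
    rev = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            rev[s].append(n)
    return rev


def statically_implied(cfg, entry, exit_node):
    """For each entity e, the entities f that dominate or postdominate e."""
    dom = dominators(cfg, entry)
    pdom = dominators(reverse(cfg), exit_node)
    return {e: (dom[e] | pdom[e]) - {e} for e in cfg}


# Hypothetical CFG: a branch at "a" that rejoins at "d".
CFG = {"entry": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["exit"], "exit": []}
IMPLIED = statically_implied(CFG, "entry", "exit")
```

In this example, executing `b` implies executing `a` and `d`, but executing `a` does not imply executing `b`, since the other branch may be taken.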

The partial ordering of concepts in the concept lattice provides the
execution-time equivalent of control flow implication. If entity f is in a concept greater than
or equal to gcon(e) then the execution of e dynamically implies the execution
of f. That is, whenever a test t covers e it also covers f. For example, consider
the procedure Min, which labels concept c2 in Figure 4(d). Concept c4, which
contains procedure IRotate, is greater than c2, so any test that covers Min also
covers IRotate. However, as the lattice also shows, there is a test (t2) in which
IRotate executes but Min does not.

Regions. If the execution of entity e implies the execution of f and the execution
of entity f implies the execution of e then e and f are said to occupy the same
control flow region. That is, there is no test that can separate the execution of
e from f. The entities either execute together or not at all. Regions partition
the set of control flow entities in a program using the static domination and
postdomination relations defined above. More precisely, entities e and f are in
the same region if e dominates f and f postdominates e.*

As with control flow implication, the concept lattice also identifies entities
that always execute together in a set of tests. If gcon(e) = gcon(f) then e and f
always execute together in the set of given tests. That is, they are in the same
dynamic region. For example, in the concept lattice in Figure 4(d), procedures
Min and Succ have the same greatest concept (c2), and thus always execute
together. Also, the procedures add and rem share the concept c6. No other two
procedures occupy a dynamic region.

Comparing Dynamic and Static Information. This section shows how the
comparison of the static and dynamic control flow relations defined in the
previous sections can be a useful aid in the development of new tests.

Suppose a program has been run on a set of tests and there is a pair of
elements e and f such that e dynamically implies the execution of f, yet e does
not statically imply the execution of f. Or suppose that gcon(e) = gcon(f), yet e
and f are in different static regions. There may be a test that covers entity e
but does not cover entity f. On the other hand, if the execution of e statically
implies the execution of f or e and f occupy the same static region, there is no
point in trying to find a test that covers e but does not cover f. In the example
of Figure 4(d), the procedures Min and Succ always execute together. However,
these procedures are in different static regions in the red-black tree program. In
fact, there is a test that separates their execution.
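The comparison can be sketched as a filter over dynamic regions. In the Python fragment below, the dynamic regions are those of the running example, while the static region identifiers are hypothetical: the text states only that Min and Succ lie in different static regions, so the shared identifier for add and rem is an assumption made purely to show a pair being filtered out.

```python
from itertools import combinations


def separation_candidates(dynamic_regions, static_region_of):
    """Pairs that always executed together but lie in different static regions.

    Such pairs are worth targeting with a new test; pairs in the same static
    region cannot be separated by any test.
    """
    candidates = []
    for region in dynamic_regions:
        for e, f in combinations(sorted(region), 2):
            if static_region_of[e] != static_region_of[f]:
                candidates.append((e, f))
    return candidates


# Dynamic regions from Figure 4(d).
DYNAMIC_REGIONS = [{"Min", "Succ"}, {"add", "rem"}]

# Hypothetical static region ids; only "Min differs from Succ" comes from the text.
STATIC_REGION_OF = {"Min": 1, "Succ": 2, "add": 3, "rem": 3}

CANDIDATES = separation_candidates(DYNAMIC_REGIONS, STATIC_REGION_OF)
```

Under these assumptions, the only pair reported is (Min, Succ), the pair for which the text notes that a separating test exists.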

* This is a particular type of region known as a weak region [2]. Strong regions identify
code that will always be executed the same number of times.
This example shows how concept analysis provides an intermediate point
between “entity-based” and “path-based” coverage criteria. Entity-based
coverage criteria such as statement or branch coverage consider coverage of entities
in isolation. Path-based coverage criteria consider sequences of entities (and so
subsume entity-based criteria), but infeasible paths greatly complicate
determining what a sufficient level of coverage is. Concept analysis identifies entities
that always execute together in a given set of tests (or whose execution implies
the execution of other entities). By comparing this “set-based” coverage
information to the static regions of a program, a programmer can determine those
entities whose execution they might try to separate.



4 Related Work 

Recent work on dynamic discovery of program invariants is closely related to
our work [11]. Ernst et al. instrument a program to record the values that
variables take on in one or more executions. This information is input into an
“invariant detection engine” that checks for a number of invariants, such as that
a variable has a constant value or takes on a small number of values, or that a
variable’s value is bounded by some range. Restated, they discover logical
invariants over a set of program executions, where the types of logical invariants
that can be identified are another input to the analysis engine.

Frequency spectrum analysis identifies control flow invariants within an
execution (such as that two entities execute the same number of times), while
concept analysis of test coverage information identifies control flow
invariants in a set of program executions. Some control flow invariants may imply
some of the invariants that Ernst et al.’s machinery discovers, and vice versa. For
example, if a control flow branch based on “x == 2” always evaluates true, then
the control flow information implies that the variable x always has the value 2
at that point in the program. Value invariance and control flow invariance, and the
techniques used to discover them, are thus quite complementary.

Other work on using dynamic analysis for exploring program executions
concentrates on “dynamic differencing” [20,15]. The idea is very simple. Each
execution of a program generates a different “profile spectrum”, a different set of
entities that are covered. This set is, of course, dependent on the input that a
program reads and its interactions with the environment. By carefully controlling
the inputs to a program and/or the environment in which it executes, perturbing
these slightly and observing the differences in the sets of covered entities,
one can determine which parts of the code are affected by the perturbations.
Wilde proposed this technique as a way to determine which code in a telephone
call processing system is responsible for different call features (such as Caller ID,
Call Waiting, etc.). In this case, different call scenarios would be used to generate
the different profile spectra, but with slight modifications to the set of calling
features that were enabled. Reps et al. showed how dynamic differencing could
be used to find code that is dependent on dates by simply changing those parts
of the input to a program related to dates (i.e., years). Both Wilde’s and Reps et
al.’s techniques are based mainly on program coverage. Reps et al. also proposed
using frequency information (counts rather than coverage) to refine the analysis.
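At the level of coverage sets, dynamic differencing amounts to a set difference between two runs. The Python sketch below illustrates the idea; the procedure names are hypothetical stand-ins for a call processing system and do not come from the cited work.

```python
def coverage_difference(baseline, perturbed):
    """Entities whose coverage changed between a baseline run and a perturbed run."""
    baseline, perturbed = set(baseline), set(perturbed)
    return {
        "only_in_baseline": baseline - perturbed,
        "only_in_perturbed": perturbed - baseline,
    }


# Hypothetical coverage of the same call scenario with and without one feature enabled.
RUN_PLAIN_CALL = {"setup_call", "route_call", "teardown_call"}
RUN_WITH_CALLER_ID = {
    "setup_call", "route_call", "lookup_caller", "show_caller_id", "teardown_call",
}

DIFF = coverage_difference(RUN_PLAIN_CALL, RUN_WITH_CALLER_ID)
```

The entities in `DIFF["only_in_perturbed"]` are the candidates for code that implements the feature that was switched on.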

Concept analysis of test-vs-coverage information is dynamic differencing of 
test coverage taken to the extreme. Concept analysis provides a full factoring 
of the coverage information that exposes not only the differences between tests, 
but what they have in common as well. In addition, it computes a number of 
useful relations, such as control flow correlation and test subsumption, in a single
framework. 

Much research has been done on applying concept analysis to aid in the
understanding and restructuring of programs [18,16,17]. All such work that we
are aware of applies the concept analysis machinery to static relationships in
a program, such as “procedure P uses variable V”, “class D inherits from class
C”, etc. Our work makes use of the same machinery, but applies it to the
dynamic relationship of “test T covers entity E” in order to help understand
the execution behavior of programs across a set of tests.

A general idea behind frequency spectrum analysis is to use the dynamic
behavior of programs to help construct models of their behavior. This basic idea
has been explored in many related settings. For example, in the area of formal
methods, many techniques for finite state machine synthesis have been proposed
for constructing finite state models from a set of traces of observed program
behavior [7]. Cook and Wolf used such techniques for reverse engineering
software processes [8] and later used related techniques to develop models from the
traces of multi-process programs [9]. In the arena of object-oriented programs,
a number of efforts have explored how to bridge the gap between programmer
models of OO behavior and what happens in OO program execution [13]. These
efforts typically instrument an OO program to record message sends and other
information, and then use a GUI to help programmers understand the traces
and build models from them.



5 Conclusions 



We have shown how frequency spectrum analysis and concept analysis of
program profiles can aid in the tasks of program comprehension, program
restructuring, and new test development. Just as program databases about static program
structure have aided programmers and testers in their jobs, databases of
dynamic program behavior gathered over the history of a program should prove
valuable in the software production cycle. The questions of what dynamic data can
be collected and stored and what tasks this data and analysis of it can support
are matters for future investigation.
Appendix

On the first day of Christmas my true love gave to me 
a partridge in a pear tree. 

On the second day of Christmas my true love gave to me 

two turtle doves 

and a partridge in a pear tree. 



On the twelfth day of Christmas my true love gave to me 
twelve drummers drumming, eleven pipers piping, ten lords a-leaping, 
nine ladies dancing, eight maids a-milking, seven swans a-swimming, 
six geese a-laying, five gold rings; 

four calling birds, three french hens, two turtle doves 
and a partridge in a pear tree. 



Fig. 5. Partial output of the obfuscated C program. 
Abstract. Traditional interprocedural data-flow analysis is performed
on whole programs; however, such whole-program analysis is not feasible 
for large or incomplete programs. We propose fragment data-flow analy- 
sis as an alternative approach which computes data-flow information for 
a specific program fragment. The analysis is parameterized by the addi- 
tional information available about the rest of the program. We describe 
two frameworks for interprocedural flow-sensitive fragment analysis, the 
relationship between fragment analysis and whole-program analysis, and 
the requirements ensuring fragment analysis safety and feasibility. We 
propose an application of fragment analysis as a second analysis phase 
after an inexpensive flow-insensitive whole-program analysis, in order 
to obtain better information for important program fragments. We also 
describe the design of two fragment analyses derived from an already 
existing whole-program flow- and context-sensitive pointer alias analysis 
for C programs and present empirical evaluation of their cost and pre- 
cision. Our experiments show evidence of dramatically better precision 
obtainable at a practical cost. 



1 Introduction 

Many phases of the software development cycle require information about the 
properties of large and complex programs. Data-flow analysis extracts semantic 
information which can be used for code optimization, program slicing, semantic 
change analysis, program restructuring, and code testing. In many cases, in- 
terprocedural data-flow analysis is needed to obtain information about program 
properties that depend on the interaction between different procedures. Flow- 
insensitive analysis ignores the ordering of statements and computes one solution 
for the whole program; in contrast, flow-sensitive analysis follows the control flow 
order of statements and computes different solutions at distinct program points. 
Context-sensitive analysis considers (sometimes approximately) only paths along 
which calls and returns are properly matched, while context-insensitive analysis 
does not make this distinction. 

Traditionally, interprocedural data-flow analysis is designed to analyze whole 
programs; however, in many cases such whole-program analysis is infeasible. For 
very large programs with hundreds of thousands or even millions of lines of code,
the time required to build a whole-program representation and the space needed
to store it are prohibitive [1]. In many cases, the programs are incomplete: the
source code for parts of the program (e.g., libraries) is not available. Empirical
evidence suggests that precise interprocedural flow-sensitive analysis does not 
scale for large programs [18]. In some cases the analysis results are not needed 
for the whole program, but only for a relatively small part of it — for example, 
a maintenance task for a specific program fragment may only require data-flow 
information for program points inside the fragment, both before and after the 
maintenance change. 

This paper proposes an alternative approach for program analysis. Instead of 
addressing the problem of computing data-flow information for the whole pro- 
gram, we address the problem of computing data-flow information for a specific 
program fragment. The problem is parameterized by the additional information
available about the rest of the program. Such fragment data-flow analysis
can avoid the problems of the traditional whole-program analysis. For exam- 
ple, information can be obtained about fragments of very large programs for 
which whole-program analysis is prohibitively expensive. Similarly, analysis can 
be performed on fragments of incomplete programs; for such programs, tradi- 
tional whole-program analysis is not possible. Finally, fragment analysis com- 
putes information about only the “interesting portion” of the program, which 
can be significantly smaller than the program itself. 

This paper is a first step in investigating the theory and practice of fragment 
data-flow analysis. It only considers flow-sensitive fragment analysis. The main 
contributions of this work can be summarized as follows: 

— We describe two frameworks for interprocedural flow-sensitive fragment anal- 
ysis, derived from existing frameworks for whole-program analysis. We dis- 
cuss the relationship between fragment analysis and whole-program analysis, 
and the requirements ensuring fragment analysis safety and feasibility. 

— We propose an application of fragment analysis as a second analysis phase 
after an inexpensive flow-insensitive whole-program analysis, in order to ob- 
tain better information for important program fragments. This approach can 
be used for programs that are too big to be analyzed by flow-sensitive
whole-program analysis, yet allow flow-insensitive whole-program analysis, for
example, C programs with around 100,000 lines of code [1,18,17].

— We describe the design of two fragment analyses derived from a whole- 
program flow- and context-sensitive pointer alias analysis [10] for C pro- 
grams. 

— We present empirical evaluation of the cost and precision of these two frag- 
ment pointer alias analyses. We show that the time and space costs of the 
analyses are practical. In about 75% of our experiments, the better of the 
two analyses results in a fourfold or higher precision improvement over the 
whole-program flow-insensitive solution. 

The rest of the paper is organized as follows: Section 2 describes frameworks 
for flow-sensitive whole-program analysis. Section 3 discusses frameworks for 
fragment analysis and the issues involved in their design. Section 4 describes
the whole-program pointer alias analysis from [10], and Section 5 describes the 
design of two fragment pointer alias analyses. Empirical results are presented in 
Section 6. Section 7 describes related work, and Section 8 presents our conclu- 
sions. 

2 Whole-Program Data-Flow Analysis 

This section presents two well-known frameworks for whole-program
interprocedural flow-sensitive data-flow analysis. Without loss of generality, we will only
consider analysis for forward data-flow problems [12]. Given a whole program
to be analyzed, a whole-program analysis constructs a data-flow framework
$\langle G, L, F, M, \eta \rangle$, where:

— $G = (N, E, \rho)$ is a directed graph with node set $N$, edge set $E$ and starting
node $\rho \in N$ (for our purposes, $G$ is an interprocedural control flow graph).

— $\langle L, \leq, \wedge \rangle$ is a meet semi-lattice [12] with partial order $\leq$ and meet $\wedge$. For
simplicity, we only consider $L$ which is finite* and has a top element $\top$.

— $F \subseteq \{ f \mid f : L \to L \}$ is a function space closed under composition and
arbitrary meets. We assume that $F$ is monotone [12].

— $M : N \to F$ is an assignment of transfer functions to the nodes in $G$ (without
loss of generality, we assume no edge transfer functions). The transfer
function for node $n$ will be denoted by $f_n$.

— $\eta \in L$ is the solution at the bottom of $\rho$.

The program is represented by an interprocedural control flow graph (ICFG) 
[10], which contains control flow graphs for all procedures in the program. Each 
procedure has a single entry node (node $\rho$ is the entry node of the starting
procedure) and a single exit node. Each call statement is represented by a pair 
of nodes, a call node and a return node. There is an edge from the call node to 
the entry node of the called procedure; there is also an edge from the exit node 
of the called procedure to the return node in the calling procedure. 

A path from node $n_1$ to node $n_k$ is a sequence of nodes $p = (n_1, \ldots, n_k)$ such
that $(n_i, n_{i+1}) \in E$. Let $f_p = f_{n_k} \circ f_{n_{k-1}} \circ \ldots \circ f_{n_2}$. A realizable path is a path
on which every procedure returns to the call site which invoked it [16,10,13];
only such paths represent potential sequences of execution steps. A same-level
realizable path is a realizable path whose first and last nodes belong to the same
procedure, and on which the number of call nodes is equal to the number of
return nodes. Such paths represent sequences of execution steps during which
the call stack may temporarily grow deeper, but never shallower than its original
depth, before eventually returning to its original depth [13]. The set of all
realizable paths from $n$ to $m$ will be denoted by $RP(n, m)$; the set of all same-level
realizable paths from $n$ to $m$ will be denoted by $SLRP(n, m)$.
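Checking realizability is essentially a bracket-matching problem, which the following Python sketch illustrates. It makes two assumptions not stated in the text: a path is given as a list of events (a call site is pushed on a call edge and must be matched on the corresponding return edge), and the path starts at a procedure entry, so a return with no pending call is treated as unrealizable. The event encoding and function name are hypothetical.

```python
def classify_path(events):
    """Classify a path as 'unrealizable', 'realizable' or 'same-level'.

    Each event is a pair (kind, site) where kind is "call", "ret" or "intra";
    `site` identifies the call site for call and return edges.
    """
    pending = []  # call sites whose procedures have not returned yet
    for kind, site in events:
        if kind == "call":
            pending.append(site)
        elif kind == "ret":
            # A return must go back to the most recent unmatched call site.
            if not pending or pending.pop() != site:
                return "unrealizable"
    # All calls matched: the stack is back at its original depth.
    return "same-level" if not pending else "realizable"


MATCHED = [("call", "c1"), ("intra", None), ("ret", "c1")]
MISMATCHED = [("call", "c1"), ("intra", None), ("ret", "c2")]
STILL_IN_CALLEE = [("intra", None), ("call", "c1"), ("intra", None)]
```

A path that ends inside a callee is realizable but not same-level, while returning to a call site other than the one that made the call makes the path unrealizable.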

Definition 1. For each $n \in N$, the meet-over-all-realizable-paths (MORP)
solution at $n$ is defined as $MORP(n) = \bigwedge_{p \in RP(\rho, n)} f_p(\eta)$.

* The results can be easily generalized for finite-height semi-lattices.
Context-Insensitive Analysis. After constructing $\langle G, L, F, M, \eta \rangle$, a
whole-program analysis computes a solution $S : N \to L$; the data-flow solution at the
bottom of node $n$ will be denoted by $S_n$. The solution is safe iff $S_n \leq MORP(n)$
for each node $n$; a safe analysis computes a safe solution for each valid input
program. Traditionally, a system of equations is constructed and then solved
using fixed-point iteration. In the simplest case, a context-insensitive analysis
constructs a system of equations of the form

$S_\rho = \eta, \qquad S_n = \bigwedge_{m \in Pred(n)} f_n(S_m)$

where $Pred(n)$ is the set of predecessor nodes for $n$. The initial solution has
$S_\rho = \eta$ and $S_n = \top$ for any $n \neq \rho$; the final solution is a fixed point of the
system and is also a safe approximation of the MORP solution.
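The fixed-point iteration can be sketched in Python as follows. The example instantiates the framework with a hypothetical "definitely assigned variables" problem: lattice elements are sets of variables, meet is set intersection, the top element is the set of all variables, and each transfer function adds the variables assigned at its node. The graph, the variable names, and the function name are assumptions made for illustration only.

```python
def solve_context_insensitive(succ, start, eta, transfer, top):
    """Iterate S_n = meet over predecessors m of f_n(S_m) until nothing changes."""
    pred = {n: [] for n in succ}
    for m, targets in succ.items():
        for n in targets:
            pred[n].append(m)
    solution = {n: top for n in succ}  # initial solution: top everywhere ...
    solution[start] = eta              # ... except at the starting node
    changed = True
    while changed:
        changed = False
        for n in succ:
            if n == start or not pred[n]:
                continue
            new = top
            for m in pred[n]:
                new = new & transfer[n](solution[m])  # meet is set intersection
            if new != solution[n]:
                solution[n] = new
                changed = True
    return solution


# Hypothetical example: a branch at "a" that rejoins at "d", with a back edge to "a".
VARS = frozenset({"x", "y", "z"})
GEN = {"rho": frozenset(), "a": frozenset({"x"}), "b": frozenset({"y"}),
       "c": frozenset({"z"}), "d": frozenset()}
SUCC = {"rho": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}
TRANSFER = {n: (lambda s, g=g: s | g) for n, g in GEN.items()}

SOLUTION = solve_context_insensitive(SUCC, "rho", frozenset(), TRANSFER, VARS)
```

At the join node `d` only `x` is definitely assigned, because `y` and `z` are each assigned on just one branch; the meet discards facts that do not hold on every incoming path.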

Context-Sensitive Analysis. The problem with the above approach is that infor- 
mation is propagated from the exit of a procedure to all of its callers. Context- 
sensitive analysis is used to handle this potential source of imprecision. One 
approach is to propagate elements of L together with tags which approximate 
the calling context of the procedure. At the exit of the procedure, these tags are 
consulted in order to back-propagate information only to call sites at which the 
corresponding calling context existed. 

The “functional approach” to context sensitivity [16] uses a new lattice that
is the function space of functions mapping $L$ to $L$. Intuitively, if the solution at
node $n$ is a map $h_n : L \to L$, then for each $x \in L$ we can use $h_n(x)$ to approximate
$f_p(x)$ for each path $p \in SLRP(e, n)$, where $e$ is the entry node of the procedure
containing $n$. In other words, $h_n(x)$ approximates the part of the solution at $n$
that occurs under calling context $x$ at $e$. If calling context $x$ never occurs at $e$,
then $h_n(x) = \top$. The solution of the original problem is obtained as $\bigwedge_{x \in L} h_n(x)$.

In general, this approach requires compact representation of functions and
explicit functional compositions and meets, which are usually infeasible. When $L$
is finite, a feasible version of the analysis can be designed [16]. Figure 1 presents a
simplified description of this version. $H[n,x]$ contains the current value of $h_n(x)$;
the worklist contains pairs $(n,x)$ for which $H[n,x]$ has changed and has to be
propagated to the successors of $n$. If $n$ is a call node, at line 7 the value of $H[n,x]$
is propagated to the entry node of the called procedure. If $n$ is an exit node, the
value of $H[n,x]$ is propagated only to return nodes at whose corresponding call
nodes $x$ occurs (lines 8-10).

For distributive frameworks [12], this algorithm terminates with the MORP
solution; for non-distributive monotone frameworks, it produces a safe
approximation of the MORP solution [16]. When the lattice is the power set of some basic
finite set $D$ of data-flow facts (e.g., the set of all potential aliases or the set of all
variable definitions), the algorithm can be modified to propagate elements of $D$
instead of elements of $2^D$. For distributive frameworks, this approach produces
a precise solution [13]. For non-distributive monotone frameworks, restricting
the context to a singleton set necessarily introduces some approximation; the
whole-program pointer alias analysis from [10] falls in this category.
input    $\langle G, L, F, M, \eta \rangle$; $L$ is finite
output   $S$: array[$N$] of $L$
declare  $H$: array[$N$, $L$] of $L$; initial values $\top$
         $W$: list of $(n,x)$, $n \in N$, $x \in L$; initially empty

[1]  $H[\rho,\eta] := \eta$; $W := \{(\rho,\eta)\}$;
[2]  while $W \neq \emptyset$ do
[3]      remove $(n,x)$ from $W$; $y := H[n,x]$;
[4]      if $n$ is not a call node or an exit node then
[5]          foreach $m \in Succ(n)$ do propagate($m$, $x$, $f_m(y)$);
[6]      if $n$ is a call node then
[7]          $e$ := called_entry($n$); propagate($e$, $y$, $f_e(y)$);
[8]      if $n$ is an exit node then
[9]          foreach $r \in Succ(n)$ and $l \in L$ do
[10]             if $H[$call_node($r$)$, l] = x$ then propagate($r$, $l$, $f_r(y)$);
[11] foreach $n \in N$ do
[12]     $S[n] := \bigwedge_{l \in L} H[n,l]$;
[13] procedure propagate($n$, $x$, $y$)
[14]     $H[n,x] := H[n,x] \wedge y$; if $H[n,x]$ changed then add $(n,x)$ to $W$;



Fig. 1. Worklist implementation of context-sensitive whole-program analysis 



3 Fragment Data-Flow Analysis 

This section describes how interprocedural flow-sensitive whole-program analysis 
can be modified to obtain fragment data-flow analysis, which analyzes a program 
fragment instead of a whole program. The structure of context-insensitive and 
context-sensitive fragment analysis is discussed, as well as the issues involved in 
the design of fragment analysis and the requirements ensuring its safety. 

3.1 Fragment Analysis Structure 

We assume that the analysis input includes a program fragment F, which is an
arbitrary set of procedures. We expect these procedures to be strongly
interrelated; otherwise, the analysis may yield information that is too imprecise. The
input also contains whole-program information I, which represents the
knowledge available about the programs to which F belongs. The whole-program
information depends on the particular software development environment and the
process in which fragment analysis is used; the role of I is further discussed in
Sect. 3.2. We will use $V_I(F)$ to denote the set of all valid whole programs that
contain F and for which I is true; depending on I, $V_I(F)$ can be anything from
a singleton set (e.g., when the source code of the whole program is available) to
an infinite set.

Given F and I, a fragment analysis extracts several kinds of information, 
shown in Table 1. Graph G' = (N', E') is the ICFG for the fragment and can be 
constructed similarly to the whole-program case, except for calls to procedures 
outside of F, which are not represented by any edges in G'. Set BoundaryEntries 
contains every entry node $e \in N'$ which has a predecessor $c \notin N'$ in some 
program from $V_I(F)$. Similarly, BoundaryCalls contains every call node $c \in N'$ 
which has a successor $e \notin N'$ in some program from $V_I(F)$. Set BoundaryReturns 
contains the return nodes corresponding to call nodes from BoundaryCalls. 
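
The three boundary sets can be illustrated with a small Python sketch. It assumes a single known call graph, whereas the definitions above quantify over every program in $V_I(F)$; the data layout (triples of caller, call node, and callee) and the names `boundary_sets`, `entry_of`, and `ret_of` are hypothetical and serve only this illustration.

```python
def boundary_sets(call_edges, fragment, entry_of, ret_of):
    """Compute (BoundaryEntries, BoundaryCalls, BoundaryReturns).

    call_edges: iterable of (caller_proc, call_node, callee_proc) triples
    fragment:   set of procedure names that form the fragment F
    entry_of:   maps a procedure name to its entry node
    ret_of:     maps a call node to its corresponding return node
    """
    boundary_entries, boundary_calls = set(), set()
    for caller, call_node, callee in call_edges:
        if caller not in fragment and callee in fragment:
            # Entry node inside F with a predecessor call node outside of F.
            boundary_entries.add(entry_of[callee])
        elif caller in fragment and callee not in fragment:
            # Call node inside F whose successor entry node is outside of F.
            boundary_calls.add(call_node)
    boundary_returns = {ret_of[c] for c in boundary_calls}
    return boundary_entries, boundary_calls, boundary_returns


# Hypothetical program: main -> f -> g -> lib, with fragment F = {f, g}.
edges = [("main", "c1", "f"), ("f", "c2", "g"), ("g", "c3", "lib")]
entries = {"f": "entry_f", "g": "entry_g", "lib": "entry_lib"}
returns = {"c1": "r1", "c2": "r2", "c3": "r3"}
be, bc, br = boundary_sets(edges, {"f", "g"}, entries, returns)
```

In this example the call from `f` to `g` stays inside the fragment, so it contributes to none of the three sets.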



Table 1. Information extracted by a fragment analysis 



Information                                Description
$\langle G', L', F', M', \eta' \rangle$    Data-flow framework
BoundaryEntries                            Entry nodes of procedures called from outside of F
BoundaryCalls                              Call nodes to procedures outside of F
BoundaryReturns                            Return nodes from procedures outside of F
$\beta_e$: BoundaryEntries $\to L'$        Summary information at boundary entry nodes
$\beta_c$: BoundaryCalls $\to F'$          Summary information at boundary call nodes



In the data-flow framework, L' is a finite meet semi-lattice with partial 
order $\leq$, meet $\wedge$ and a top element $\top$. $F' \subseteq \{f \mid f: L' \to L'\}$ is a monotone 
function space closed under composition and arbitrary meets. M' is an 
assignment of transfer functions to the nodes in G'; the transfer function for node n 
will be denoted by $f_n$. Value $\eta' \in L'$ is the solution at the bottom of the entry 
node of the program; it is needed only if F contains the starting procedure of 
the program. 

In order to summarize the effects of the rest of the program on F, a 
fragment analysis constructs two maps. Map $\beta_e$: BoundaryEntries $\to L'$ assigns to 
each boundary entry node e a value which approximates the part of the 
solution at e that is due to realizable paths that reach e from outside of F. Map 
$\beta_c$: BoundaryCalls $\to F'$ assigns to each boundary call node a function which 
summarizes the effects of all same-level realizable paths from the entry to the 
exit of the called procedure. Both maps are discussed in Sect. 3.3. 

Based on the extracted information from Table 1, a context-insensitive frag- 
ment analysis solves the system of equations shown in Fig. 2. A context-sensitive 
fragment analysis can be constructed as in Sect. 2; a worklist implementation 
is shown in Fig. 3. This algorithm is a modified version of the one presented in 
Fig. 1, with new or modified lines labeled with asterisks. 
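
As a rough illustration of the context-insensitive case, the following Python sketch solves the equations of Fig. 2 by simple round-robin iteration. The function name `solve_fragment`, the dictionary-based graph encoding, and the toy gen-set transfer functions are assumptions made for this example; the lattice is a power set with union as meet and the empty set as top.

```python
def solve_fragment(nodes, pred, f, meet, top, beta_e, beta_c, call_node,
                   rho=None, eta=None):
    """Iterate the equations of Fig. 2 until nothing changes.

    nodes:     fragment ICFG nodes (N')
    pred:      node -> list of predecessor nodes inside the fragment
    f:         node -> transfer function on lattice values
    beta_e:    boundary entry node -> summary value
    beta_c:    boundary call node -> summary function
    call_node: boundary return node -> its boundary call node
    rho, eta:  start node and its value, only if the fragment contains it
    """
    S = {n: top for n in nodes}
    if rho is not None:
        S[rho] = eta
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == rho:
                continue                      # the start node keeps its value
            if n in call_node:                # boundary return node
                m = call_node[n]
                new = f[n](beta_c[m](S[m]))
            else:
                new = beta_e.get(n, top)      # summary value at boundary entries
                for m in pred[n]:
                    new = meet(new, f[n](S[m]))
            if new != S[n]:
                S[n], changed = new, True
    return S


# Toy fragment: entry e -> a -> boundary call c; boundary return r -> exit x.
gen = {"e": set(), "a": {"a"}, "c": set(), "r": set(), "x": set()}
f = {n: (lambda s, g=frozenset(g): s | g) for n, g in gen.items()}
pred = {"e": [], "a": ["e"], "c": ["a"], "r": [], "x": ["r"]}
S = solve_fragment(
    nodes=list(gen), pred=pred, f=f,
    meet=lambda a, b: a | b, top=frozenset(),
    beta_e={"e": frozenset({"in"})},
    beta_c={"c": lambda s: s | {"callee"}},
    call_node={"r": "c"},
)
```

Termination relies on the same assumptions as the framework itself: monotone transfer functions over a finite lattice.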

3.2 Fragment Analysis Design 

Designers of a specific fragment analysis have to address several important prob- 
lems. One problem is to decide what kind of whole-program information I to 
use. The decision depends on the software development environment, as well as 
the process in which the fragment analysis is used. In this paper, we are particu- 
larly interested in a process where an inexpensive flow-insensitive whole-program 
$S_n = \bigwedge_{m \in Pred(n)} f_n(S_m)$                     if $n \notin$ BoundaryEntries $\cup$ BoundaryReturns

$S_n = \beta_e(n) \wedge \bigwedge_{m \in Pred(n)} f_n(S_m)$   if $n \in$ BoundaryEntries

$S_n = f_n(\beta_c(m)(S_m))$                                   if $n \in$ BoundaryReturns and $m$ = call_node($n$)

$S_\rho = \eta'$                                                if $\rho \in N'$



Fig. 2. Context-insensitive fragment analysis 

input    $\langle G', L', F', M', \eta' \rangle$; $L'$ is finite
output   $S$: array[$N'$] of $L'$
declare  $H$: array[$N'$, $L'$] of $L'$; initial values $\top$
         $W$: list of $(n,x)$, $n \in N'$, $x \in L'$; initially empty
[1a*]  if $\rho \in N'$ then
[1b]     $H[\rho,\eta'] := \eta'$; add $(\rho,\eta')$ to $W$;
[1c*]  foreach $n \in$ BoundaryEntries do
[1d*]    $H[n,\beta_e(n)] := \beta_e(n)$; add $(n,\beta_e(n))$ to $W$;
[2]    while $W \neq \emptyset$ do
[3]      remove $(n,x)$ from $W$; $y := H[n,x]$;
[4]      if $n$ is not a call node or an exit node then
[5]        foreach $m \in Succ(n)$ do propagate($m, x, f_m(y)$);
[6a*]    if $n$ is a call node and $n \notin$ BoundaryCalls then
[7a]       $e$ := called_entry($n$); propagate($e, y, f_e(y)$);
[6b*]    if $n$ is a call node and $n \in$ BoundaryCalls then
[7b*]      $r$ := ret_node($n$); propagate($r, x, f_r(\beta_c(n)(y))$);
[8]      if $n$ is an exit node then
[9]        foreach $r \in Succ(n)$ and $l \in L'$ do
[10]         if $H[$call_node$(r), l] = x$ then propagate($r, l, f_r(y)$);
[11]   foreach $n \in N'$ do
[12]     $S[n] := \bigwedge_{l \in L'} H[n,l]$;



Fig. 3. Worklist implementation of context-sensitive fragment analysis 



analysis is performed first, and then a more precise flow-sensitive fragment anal- 
ysis is used on fragments for which better information is needed. This approach 
can be used for programs that are too big to be analyzed by flow-sensitive whole- 
program analysis, yet allow flow-insensitive whole-program analysis (for example, 
C programs with around 100,000 lines of code [1,18,17]). In this scenario, the 
first stage computes a whole-program flow-insensitive solution and the program 
call graph. The second stage uses the call graph and the flow-insensitive solution 
as its whole-program information I. The two fragment analyses in Sect. 5 are 
designed in this manner. 

Another problem is to construct the information described in Table 1. Sets 
BoundaryCalls and BoundaryReturns can be determined from F. Set 
BoundaryEntries by definition depends on I. The summary information at boundary 
nodes also depends on I. When the fragment analysis follows a flow-insensitive 
whole-program analysis, I contains the call graph of the program, from which 
the boundary entries can be easily determined. In this case, $\beta_e$ and $\beta_c$ can be 
extracted from the whole-program flow-insensitive solution, as shown in Sect. 5. 

The semi-lattice L' depends mostly on F. However, it may also depend 
on I; for example, when the fragment analysis follows a flow-insensitive 
whole-program analysis, I contains a whole-program solution which may need to be 
represented (possibly approximately) using L'. The fragment analysis complexity 
is bounded by a function of the size of L'; therefore, it is crucial to ensure 
that the size of L' depends only on the size of F and not the size of I. This 
requirement guarantees that the fragment analysis will be feasible for relatively 
small fragments of very large programs. The fragment analyses from Sect. 5 
illustrate this situation. 

3.3 Fragment Analysis Safety 

The fragment analyses outlined above are similar in structure to the 
whole-program analyses from Sect. 2. In fact, we are only interested in fragment 
analyses that are derived from whole-program analyses. Consider a safe whole-program 
analysis which we want to modify in order to obtain a safe fragment analysis. The 
most important problem is to define the relationship between the semi-lattice 
L' for a fragment F and the whole-program semi-lattices $L_p$ for the programs 
$p \in V_I(F)$. For each such $L_p$, the designers of the fragment analysis must define 
an abstraction relation $\alpha_p \subseteq L_p \times L'$. This relation encodes the notion of safety 
and is used to prove that the analysis is safe; it is never explicitly constructed 
or used during the analysis. If $(x,x') \in \alpha_p$, we will write “$\alpha_p(x,x')$”. 

Intuitively, the abstraction relation $\alpha_p$ defines the relationship between the 
“knowledge” represented by values from $L_p$ and the “knowledge” represented by 
values from L'. If $\alpha_p(x,x')$, the knowledge associated with x' “safely abstracts” 
the knowledge associated with x; thus, $\alpha_p$ is similar in nature to the abstraction 
relations used for abstract interpretation [19,6]. The choice of $\alpha_p$ depends both 
on the original whole-program analysis and the intended clients of the fragment 
analysis solution. Sect. 5 presents an example of one such choice. 

Definition 2. A solution produced by a fragment analysis for an input pair 
(F, I) is safe iff $\alpha_p(MORP_p(n), S_n)$ for each $p \in V_I(F)$ and each $n \in N'$, where 
$S_n$ is the fragment analysis solution at n and $MORP_p(n)$ is the MORP solution 
at n in p. A safe fragment analysis yields a safe solution for each valid input 
pair (F, I). 

With this definition in mind, we present a set of sufficient requirements that 
ensures the safety of the fragment analysis. Intuitively, the first requirement 
ensures that a safe approximation in L', as defined by the partial order $\leq$, is 
also a safe abstraction according to $\alpha_p$. The second requirement ensures that if 
$x' \in L'$ safely abstracts the effects of each realizable path ending at a given node 
n, it can be used to safely abstract the MORP solution at n. Formally, for any 
$p \in V_I(F)$ and its $\alpha_p$, any $x,y \in L_p$ and any $x',y' \in L'$, the following must be 
true: 

Property 1: if $\alpha_p(x,x')$ and $y' \leq x'$, then $\alpha_p(x,y')$ 

Property 2: if $\alpha_p(x,x')$ and $\alpha_p(y,x')$, then $\alpha_p(x \wedge y, x')$ 

The next requirement ensures that the transfer functions in the fragment 
analysis safely abstract the transfer functions in the whole-program analysis. 
Let $f_{n,p}$ be the transfer function for $n \in N'$ in the whole-program analysis for 
any $p \in V_I(F)$. For any $n \in N'$, $x \in L_p$ and $x' \in L'$, the following must be true: 

Property 3: if $\alpha_p(x,x')$, then $\alpha_p(f_{n,p}(x), f_n(x'))$ 

If F contains the starting procedure of the program, then $\rho \in N'$ and the 
fragment analysis solution at the bottom of $\rho$ is $\eta'$. Let $\eta_p$ be the whole-program 
solution at the bottom of $\rho$ for any $p \in V_I(F)$. Then the following must be true: 

Property 4: $\alpha_p(\eta_p, \eta')$ 

The next requirement ensures that for each boundary entry node e, the 
summary value $\beta_e(e)$ safely abstracts the effects of each realizable path that reaches 
e from outside of F. For any $p \in V_I(F)$, let $RP^{out}(p,e)$ be the set of all realizable 
paths $q = (\rho, \ldots, c, e)$ in p such that $c \notin N'$; recall that $f_q$ is the composition of 
the transfer functions for the nodes in q. Then the following must be true: 

Property 5: $\forall q \in RP^{out}(p,e)$: $\alpha_p(f_q(\eta_p), \beta_e(e))$ 

The last requirement ensures that the summary function $\beta_c(c)$ for each 
boundary call node c safely abstracts the effects of each same-level realizable 
path from the entry to the exit of the called procedure. Consider any $p \in V_I(F)$ 
in which c has a successor entry node $e \notin N'$ and let t be the exit node 
corresponding to e. Let $SLRP_p(e,t)$ be the set of all same-level realizable paths in 
p from e to t. Intuitively, each realizable path ending at t has a suffix which is 
a same-level realizable path; thus, $\beta_c(c)$ should safely abstract each path from 
$SLRP_p(e,t)$. Formally, for any $x \in L_p$ and $x' \in L'$, the following must be true: 

Property 6: $\forall q \in SLRP_p(e,t)$: if $\alpha_p(x,x')$, then $\alpha_p(f_q(x), \beta_c(c)(x'))$ 

If the above requirements are satisfied, the context-insensitive fragment 
analysis derived from a safe context-insensitive whole-program analysis is safe, 
according to Definition 2. The proof considers a fixed-point solution of the system 
in Fig. 2. It can be shown that for each $p \in V_I(F)$, each $n \in N'$ and each 
realizable path $q = (\rho_p, \ldots, n)$, it is true that $\alpha_p(f_q(\eta_p), S_n)$. Each such q is the 
concatenation of two realizable paths r' and r''. Path r' starts from $\rho_p$ and lies 
entirely outside of F; if $\rho_p \in N'$, r' is empty. Path $r'' = (e, \ldots, n)$ has $e \in N'$ and 
$n \in N'$ and is the fragment suffix of q; it may enter and leave F arbitrarily. The 
proof is by induction on the length of the fragment suffix of q. 

Similarly, the context-sensitive fragment analysis derived from a safe 
context-sensitive whole-program analysis is safe. It is enough to show that for each 
$p \in V_I(F)$, each $n \in N'$ and each realizable path $q = (\rho_p, \ldots, n)$, there exists a 
value $l \in L'$ such that $\alpha_p(f_q(\eta_p), H[n,l])$. Again, this can be proven by induction 
on the length of the fragment suffix of q. 
4 Whole-Program Pointer Alias Analysis 

This section presents a simplified high-level description of the Landi-Ryder 
whole-program pointer alias analysis [10] for the C programming language. The 
analysis considers a set of names that can be described by the grammar in Fig. 4. 
Each of the names corresponds to one or more run-time memory locations. An 
alias pair (or simply an alias) is a pair of names that potentially represent the 
same memory location. 



<Name>        ::= <SimpleName> | <Deref> | <ArrayElem>
                | <FieldAccess> | <K-Limited>
<SimpleName>  ::= identifier          /* variable name */
                | heap_n              /* heap location */
<Deref>       ::= *(<Name>+?)         /* dereference */
<ArrayElem>   ::= <Name>[?]           /* array element */
<Fld>         ::= identifier          /* field name */
<FieldAccess> ::= <Name>.<Fld>        /* field of a structure */
<K-Limited>   ::= <Name>#             /* k-limited name */



Fig. 4. Grammar for names 



If a program point n is a call to malloc or a similar function, name heap_n 
represents the set of all heap memory locations created at this point during 
execution. If there are recursive data structures (e.g., linked lists), the number 
of names is potentially infinite. The analysis limits the number of dereferences 
in a name to a given constant k; any name with more than k dereferences is 
represented by a k-limited name. For example, for k = 1, name (*((*p).f)).f is 
represented by the k-limited name ((*p).f)#. 

A simple name is a name generated by the SimpleName nonterminal in the 
grammar. The root name of a name n is the simple name used when n is 
generated by the grammar; for example, the root name of (*p).f is p. Name n is a 
fixed location if it does not contain any dereferences. Name (*p).f is not a fixed 
location, while s[?].f is a fixed location. 
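
The notions of k-limiting, root name, and fixed location can be sketched in Python as follows. The tuple encoding of names and all helper names are hypothetical. The truncation rule is only one plausible reading of the k-limiting description above, chosen so that it reproduces the k = 1 example. Treating a k-limited name as a non-fixed location is likewise an assumption of this sketch.

```python
# Names are nested tuples: ("var", id), ("deref", n), ("elem", n),
# ("field", n, fld), and ("klim", n) for a k-limited name.
def var(name): return ("var", name)
def deref(n): return ("deref", n)
def elem(n): return ("elem", n)
def field(n, fld): return ("field", n, fld)

def derefs(n):
    """Number of dereferences contained in name n."""
    if n[0] == "var":
        return 0
    return derefs(n[1]) + (1 if n[0] == "deref" else 0)

def k_limit(n, k):
    """Represent n by a k-limited name if it has more than k dereferences."""
    if n[0] == "var":
        return n
    inner = k_limit(n[1], k)
    if inner[0] == "klim":
        return inner                      # already truncated
    if n[0] == "deref" and derefs(inner) == k:
        return ("klim", inner)            # one more dereference would exceed k
    return (n[0], inner) + n[2:]

def root_name(n):
    while n[0] != "var":
        n = n[1]
    return n[1]

def is_fixed_location(n):
    while n[0] != "var":
        if n[0] in ("deref", "klim"):     # assumption: k-limited names are non-fixed
            return False
        n = n[1]
    return True

def show(n):
    """Render a name in the notation used in the text."""
    kind = n[0]
    if kind == "var":
        return n[1]
    inner = show(n[1])
    if kind == "deref":
        return "*" + (inner if n[1][0] == "var" else "(" + inner + ")")
    if kind == "klim":
        return "(" + inner + ")#"
    if n[1][0] == "deref":
        inner = "(" + inner + ")"
    return inner + ("[?]" if kind == "elem" else "." + n[2])


long_name = field(deref(field(deref(var("p")), "f")), "f")   # (*((*p).f)).f
limited = k_limit(long_name, 1)
```

Here `limited` corresponds to the k-limited name from the example, and `is_fixed_location` separates `s[?].f` from `(*p).f` as the text describes.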

The lattice of the analysis is the power set of the set of all pairs of names; the 
meet operator is set union and the partial order is the “is-superset-of” relation. 
The analysis is flow- and context-sensitive, and conceptually follows the Sharir- 
Pnueli algorithm described in Sect. 2. However, since the lattice is a power set, 
the algorithm can be modified to propagate single aliases instead of sets of aliases 
— the worklist contains triples (n,RA,PA), where n is a node, RA is a reaching 
alias that is part of the solution at the entry of the procedure to which n belongs, 
and PA is a possible alias at the bottom of n. Restricting the reaching alias set to 
a single alias introduces some approximation; therefore, the actual Landi-Ryder 
algorithm can be viewed as an approximation algorithm solving the more precise 
Sharir-Pnueli formulation of the problem. 
The fragment analyses in Sect. 5 are derived from a modified version of the 
Landi-Ryder analysis. This version considers only aliases containing a fixed 
location; aliases with two non-fixed locations (e.g., an alias between (*p).f and *q) 
are ignored during propagation. It can be shown that this is safe, as long as the 
program does not contain assignments in the form *p=&x; if such assignments 
exist, they can be removed by introducing intermediate temporary variables 
(for example, t=&x; *p=t). An assignment whose left-hand side is a non-fixed 
location (i.e., through-deref assignment) potentially modifies many fixed locations. 
A safe solution for the modified analysis is sufficient to estimate all fixed 
locations possibly modified by through-deref assignments. We refer to this process 
as resolution of through-deref assignments. 
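
Resolution of a through-deref assignment can be pictured with the short Python sketch below. It assumes aliases are stored as pairs of name strings and that a name is a fixed location exactly when it contains no `*`; both the encoding and the function name `resolve_through_deref` are hypothetical.

```python
def is_fixed(name):
    # Assumption for this sketch: string names, dereferences written with "*".
    return "*" not in name

def resolve_through_deref(lhs, aliases):
    """Fixed locations possibly modified by an assignment through `lhs`."""
    modified = set()
    for a, b in aliases:
        if a == lhs and is_fixed(b):
            modified.add(b)
        elif b == lhs and is_fixed(a):
            modified.add(a)
    return modified


# Alias solution at the program point of the assignment `*p = ...`.
aliases_at_point = {("*p", "x"), ("y", "*p"), ("*q", "z")}
targets = resolve_through_deref("*p", aliases_at_point)
```

Only aliases that pair the assigned name with a fixed location contribute, which is why the modified analysis can ignore aliases between two non-fixed locations.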

5 Fragment Pointer Alias Analyses 

This section describes two fragment pointer alias analyses, basic analysis $A_1$ 
and extended analysis $A_2$, derived from the modified whole-program analysis 
in Sect. 4. They are used for resolution of through-deref assignments. Their 
design is based on a process in which flow-insensitive whole-program pointer 
alias analysis is performed first, and then more precise flow-sensitive fragment 
pointer alias analysis is used on fragments for which better information is needed. 
As described in Sect. 3.2, in this case the whole-program information I contains 
a whole-program flow-insensitive solution and the program call graph. 

Analysis Input. The flow-insensitive solution $S_{FI}$ is obtained using a 
whole-program flow- and context-insensitive analysis similar to the one in [21]. 
Intuitively, the names in the program are partitioned into equivalence classes; if two 
names can be aliases, they belong to the same class in the final solution. Every 
name starts in its own equivalence class and then classes are joined as 
possible aliases are discovered. For example, statement p=&x causes the equivalence 
classes of *p and x to merge. Overall, the analysis is similar to other flow- and 
context-insensitive analyses with almost-linear cost [17,15]. Then $S_{FI}$ is used to 
resolve calls through function pointers and to produce a safe approximation of 
the program call graph. The sets of boundary entry, call and return nodes can 
be easily determined from this graph. 
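
The merging of equivalence classes can be illustrated with a union-find structure, as in the Python sketch below. This is not the analysis from [21]: only the rule for p=&x comes from the text, while the rule for pointer copies, the statement encoding, and the names `UnionFind` and `flow_insensitive_classes` are assumptions added to make the example self-contained.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)      # every name starts in its own class
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


def flow_insensitive_classes(statements):
    """statements: list of (kind, lhs, rhs) with kind in {"addr", "copy"}."""
    uf = UnionFind()
    for kind, lhs, rhs in statements:
        if kind == "addr":                # lhs = &rhs: merge classes of *lhs and rhs
            uf.union("*" + lhs, rhs)
        elif kind == "copy":              # lhs = rhs (assumed rule): merge *lhs and *rhs
            uf.union("*" + lhs, "*" + rhs)
    classes = {}
    for name in list(uf.parent):
        classes.setdefault(uf.find(name), set()).add(name)
    return list(classes.values())


program = [("addr", "p", "x"), ("copy", "q", "p"), ("addr", "r", "y")]
classes = flow_insensitive_classes(program)
```

Statement order does not matter here, which is what makes such an analysis flow-insensitive.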

The basic analysis $A_1$ takes as input $S_{FI}$ and the program call graph, as 
well as the source code for the analyzed fragment F. The extended analysis 
$A_2$ requires as additional input the source code for all procedures directly or 
indirectly called by procedures in F. These additional procedures together with 
the procedures from F form the extended fragment $F_{ext}$. Including all transitively 
called procedures allows $A_2$ to estimate better the effects of calls to procedures 
outside of F; the tradeoffs of this approach are further discussed in Sect. 8. 

Analysis Lattices. The whole-program lattice is based on the set of names 
generated by the grammar in Fig. 4. Each such name can be classified as either 
relevant or irrelevant. A relevant name has a root name with a syntactic 
occurrence in the fragment; all other names are irrelevant. Since the fragment analysis 
solution is used to resolve through-deref assignments, only aliases that contain 
a relevant non-fixed location are useful. If the new lattice L' contains all such 
aliases, its size still potentially depends on the number of fixed locations in the 
whole program. This problem can be solved by using special placeholder variables 
to represent sets of related irrelevant fixed locations; this approach is similar to 
the use of representative data-flow values in other analyses [11,10,7,20,4]. 

Consider an equivalence class $E_i \in S_{FI}$ that contains at least one relevant 
name. If $E_i$ contains irrelevant fixed locations, a placeholder variable $ph_i$ is 
used to represent them. The aliases from L' fall in two categories: (1) a pair 
of relevant names, exactly one of which is a fixed location, and (2) a relevant 
non-fixed location and a placeholder variable. Clearly, the number of relevant 
names and the number of placeholder variables depend only on the size of the 
fragment; thus, the size of L' is independent of the size of the whole program. 

In the final solution, an alias with two relevant names represents itself, while 
an alias with a placeholder $ph_i$ represents a set of aliases, one for each irrelevant 
fixed location represented by $ph_i$. This can be formalized by defining an 
abstraction relation $\alpha_p \subseteq L_p \times L'$ for each $p \in V_I(F)$. For each pair of sets $S \in L_p$ and 
$S' \in L'$, we have $\alpha_p(S,S')$ iff each alias from S is safely abstracted by S'. Let x 
be the non-fixed location in alias $(x,y) \in S$; then $\alpha_p$ is defined as follows: 

- If x is an irrelevant name, then (x,y) is safely abstracted by S' 

- If x and y are relevant and $(x,y) \in S'$, then (x,y) is safely abstracted by S' 

- If x is relevant and y is irrelevant, $E_i$ is y's equivalence class, and $(x,ph_i) \in S'$, 
  then (x,y) is safely abstracted by S' 

Intuitively, the above definition says that aliases involving an irrelevant non- 
fixed location can be safely ignored. It also says that irrelevant fixed locations 
from the same equivalence class have equivalent behavior and need not be con- 
sidered individually. The safety implications of this definition are discussed later. 
Note that if a fragment alias solution is safe according to the above definition, 
it can be used to safely resolve through-deref assignments. 
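
The three cases of the abstraction relation translate directly into a membership test. The following Python sketch assumes aliases are pairs (non-fixed location, fixed location) of name strings; the arguments `relevant`, `class_of`, and `placeholder` are hypothetical encodings of the relevant names, the equivalence classes of the flow-insensitive solution, and the placeholder variables.

```python
def safely_abstracted(alias, s_prime, relevant, class_of, placeholder):
    """Does the fragment alias set s_prime safely abstract one whole-program alias?"""
    x, y = alias                          # x: non-fixed location, y: fixed location
    if x not in relevant:
        return True                       # case 1: irrelevant non-fixed location
    if y in relevant:
        return (x, y) in s_prime          # case 2: both names relevant
    ph = placeholder.get(class_of.get(y)) # case 3: y is represented by a placeholder
    return (x, ph) in s_prime


def alpha(s, s_prime, relevant, class_of, placeholder):
    """S' abstracts S iff every alias in S is safely abstracted by S'."""
    return all(safely_abstracted(a, s_prime, relevant, class_of, placeholder)
               for a in s)


relevant = {"*p", "x"}
class_of = {"g1": 0, "g2": 0}             # irrelevant fixed locations in class E_0
placeholder = {0: "ph_0"}
whole = {("*p", "x"), ("*p", "g1"), ("*p", "g2"), ("*ext", "g1")}
good = {("*p", "x"), ("*p", "ph_0")}
bad = {("*p", "x")}
```

In this example `good` abstracts `whole`, whereas `bad` does not, because the aliases of `*p` with `g1` and `g2` have no representative.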

Summaries at Boundary Nodes. Based on $S_{FI}$, $A_1$ and $A_2$ construct a set of 
aliases $\sigma \in L'$. Each equivalence class $E_i$ is examined and for each pair of names 
(x,y) of a non-fixed location x and a fixed location y from the class, the following 
is done: first, if x is an irrelevant name, the pair is ignored. Otherwise, if y is a 
relevant name, (x,y) is added to $\sigma$. Otherwise, if y is an irrelevant name, $(x,ph_i)$ 
is added to $\sigma$. Clearly, (x,y) is safely abstracted by $\sigma$ according to the definition 
of $\alpha_p$ given above. 
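
The construction of the summary alias set can be sketched in a few lines of Python. The string encoding of names, the `ph_<i>` naming of placeholders, and the function name `build_sigma` are assumptions of this example.

```python
def build_sigma(classes, relevant, is_fixed):
    """Build the summary alias set from flow-insensitive equivalence classes.

    classes:  list of sets of names (the classes E_i)
    relevant: set of relevant names
    is_fixed: predicate telling whether a name is a fixed location
    """
    sigma = set()
    for i, cls in enumerate(classes):
        for x in cls:
            if is_fixed(x) or x not in relevant:
                continue                  # x must be a relevant non-fixed location
            for y in cls:
                if not is_fixed(y):
                    continue
                if y in relevant:
                    sigma.add((x, y))
                else:
                    sigma.add((x, "ph_%d" % i))   # irrelevant y: use the placeholder
    return sigma


classes = [{"*p", "x", "g1", "g2"}, {"*ext", "g3"}]
relevant = {"*p", "x"}
sigma = build_sigma(classes, relevant, lambda n: not n.startswith("*"))
```

The two irrelevant fixed locations `g1` and `g2` collapse into a single pair with the placeholder, so the size of the result does not grow with the number of irrelevant names.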

The basic analysis $A_1$ defines $\beta_e(e) = \sigma$ for each boundary entry node e 
and $\beta_c(c)(x) = \sigma$ for each $x \in L'$ and each boundary call node c. This essentially 
means that $S_{FI}$ is used to approximate the solutions at boundary entry nodes 
and boundary return nodes. For the extended analysis $A_2$, there are no boundary 
call nodes. For each boundary entry node e that belongs to the original fragment 
F, the analysis defines $\beta_e(e) = \sigma$. If e is part of the extended fragment $F_{ext}$, but 
is not part of the original fragment F, the analysis defines $\beta_e(e) = \emptyset$. 
Analysis Safety. To show the safety of $A_1$, it is enough to show that the 
requirements from Sect. 3.3 are satisfied. Properties 1 and 2 are trivially satisfied. Since 
both $\eta_p$ and $\eta'$ are the empty set, Property 4 is also true. 

Showing that Property 3 is true requires careful examination of the transfer 
functions from [10]. The formal proof is based on two key observations. The first 
one is that aliases involving an irrelevant non-fixed location are only propagated 
through the nodes in the fragment, without actually creating any new aliases; 
therefore, they can be safely ignored. The second observation is that aliases 
with the same non-fixed location and with different irrelevant fixed locations 
have equivalent behavior. For example, alias (*p,x) at the top of statement q=p; 
results in (*p,x) and (*q,x) at the bottom of the statement. Similarly, (*p,y) 
results in (*p,y) and (*q,y). If both x and y are represented by a placeholder 
$ph_i$, alias $(*p,ph_i)$ results in $(*p,ph_i)$ and $(*q,ph_i)$, which satisfies Property 3. 
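
The example with statement q=p; can be replayed in Python. The transfer function below is a deliberately simplified stand-in (old aliases of *q are killed and every alias of *p is copied to *q), not the transfer functions from [10]; the names `transfer_copy` and `abstract` are hypothetical.

```python
def transfer_copy(aliases, q, p):
    """Simplified effect of `q = p` on (non-fixed, fixed) alias pairs."""
    out = {(x, y) for (x, y) in aliases if x != "*" + q}      # kill aliases of *q
    out |= {("*" + q, y) for (x, y) in aliases if x == "*" + p}
    return out


def abstract(aliases, placeholder_of):
    """Replace irrelevant fixed locations by their placeholder."""
    return {(x, placeholder_of.get(y, y)) for (x, y) in aliases}


whole_program = {("*p", "x"), ("*p", "y")}
placeholder_of = {"x": "ph_i", "y": "ph_i"}   # x and y share one placeholder

concrete_then_abstract = abstract(transfer_copy(whole_program, "q", "p"),
                                  placeholder_of)
abstract_then_transfer = transfer_copy(abstract(whole_program, placeholder_of),
                                       "q", "p")
```

For this statement both orders give the same set of aliases, which is the behavior Property 3 asks of the fragment transfer function.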

The set $\sigma$ described above is extracted from the whole-program 
flow-insensitive alias solution, which is safe. Therefore, $\sigma$ safely abstracts any alias that 
could be true at a node in the program; thus, Properties 5 and 6 are true. Since 
all requirements are satisfied, $A_1$ is safe. Similarly to the whole-program case, 
the actual implementation propagates single aliases instead of sets of aliases. As 
a result, the reaching alias set is restricted to a single alias and therefore some 
approximation is introduced. Of course, the resulting solution is still safe. 

For the extended analysis $A_2$, it is not true that the solution is safe at each 
node in the extended fragment. For example, consider the entry node of a 
procedure that was not in the original fragment F. If this procedure is called from 
outside of the extended fragment, aliases could be propagated along this call 
edge during the whole-program analysis, but would be missing in the fragment 
analysis. However, it can be proven that for each node in the original fragment 
F, the solution is safe. The proof is very similar to the one outlined in Sect. 3.3, 
and still requires that Properties 1 through 4 are true. For each realizable path 
q starting at $\rho$ and ending at a node in F, its fragment suffix is the subpath 
starting at the first node in q that belongs to F. The proof is by induction on 
the length of the fragment suffix; the base case of the induction depends on the 
fact that $\beta_e(e) = \sigma$ for boundary entry nodes in F, and therefore the solution 
at such nodes is safe. 

6 Empirical Results 

Our implementation uses the PROLANGS Analysis Framework (version 1.0; 
http://www.prolangs.rutgers.edu), which incorporates the Edison Design Group front 
end for C/C++. The results were gathered on a Sun Sparc-20 with 352 MB of memory. 
The implementation analyzes a reduced version of C that excludes signals, setjmp, 
and longjmp, but allows function pointers, type casting and union types. 

Table 2 describes the C programs used in our experiments. It shows the program 
size in lines of code and number of ICFG nodes, the number of procedures, 
the total number of assignments, and the number of through-deref assignments. 
Table 2. Analyzed Programs 

Program     LOC     ICFG Nodes   Procs   Assignments (All)   Assignments (Deref)
zip          8177       6172      122          3443                  582
sc           8530       6678      160          3440                  159
larn        10014      12063      298          6021                  389
espresso    14910      15339      372          7822                 1322
tsl         16053      15469      471          7249                  507
moria       25292      20213      458         10557                 1358



For each of the data programs, we extracted by hand subsets of related procedures 
that formed a cohesive fragment. Significant effort was put into determining 
realistic fragments — the program source code was thoroughly examined, the 
program call graph was used to obtain better understanding of the calling re- 
lationships in the program, and the documentation (both external and inside 
the source code) was carefully analyzed. For each program, two fragments were 
extracted. For example, for zip one of the fragments consisted of all procedures 
implementing the implosion algorithm. For espresso, one of the fragments con- 
sisted of the procedures for reducing the cubes of a boolean function. For sc, 
one of the fragments consisted of the procedures used to evaluate expressions in 
a spreadsheet. Some characteristics of the fragments are given in Table 3. 

For each fragment, three experiments were performed; the results are shown 
in Table 4. In the first experiment, the solution from the whole-program flow- 
insensitive (FI) analysis was used at each through-deref assignment to determine 
the number of simple names possibly modified by the assignment. For a fixed 
location that was not a simple name, a modification of its root simple name was 
counted (e.g., a modification of s.f was counted as a modification of s). Then 



Table 3. Analyzed Fragments 



              ICFG                       Boundary   Boundary   Assignments   Assignments
Fragment      Nodes   % Total   Procs    Entries    Calls      (All)         (Deref)
zip.1          1351     21.9      28        5         17          776            59
zip.2           429      7.0       9        5         19          255            50
sc.1           1238     18.5      30        3         30          609             8
sc.2            793     11.9      13        3         10          459            23
larn.1          345      2.9       4        4         35          188            46
larn.2          420      3.5      11        9         44          216             4
espresso.1      440      2.9       6        2         40          234            38
espresso.2      963      6.3      19        5        113          461            53
tsl.1           355      2.3      13        4         33          175            15
tsl.2          1004      6.5      29        7        134          459            17
moria.1        2678     13.2      43       14        348         1634           392
moria.2        1221      6.0      27        7        149          644            49
Table 4. Precision Comparison 

Fragment      WholePrgm FI   Basic FFS   Percent   Extended FFS   Percent
zip.1             15.78        12.98       82.3        12.98        82.3
zip.2              3.88         1.02       26.3         1.02        26.3
sc.1               9.50         9.00       94.7         1.00        10.5
sc.2              13.14         3.29       25.0         3.29        25.0
larn.1            17.00        14.57       85.7         1.00         5.9
larn.2             9.00         9.00      100.0         1.00        11.1
espresso.1       107.10       107.10      100.0        34.00        31.7
espresso.2        57.55        57.55      100.0        35.11        61.0
tsl.1             41.87        17.07       40.8         1.00         2.4
tsl.2             41.88        24.82       59.3         1.00         2.4
moria.1          132.80         1.46        1.1         1.46         1.1
moria.2           31.04         3.55       11.4         3.55        11.4



the average across all through-deref assignments in the fragment was taken. In 
the second experiment, the FI analysis was followed by the basic fragment flow- 
sensitive (FFS) analysis. Again, for each through-deref assignment the number 
of simple names possibly modified was determined; placeholder variables were 
expanded to determine the actual simple names modified. Then the average 
across all through-deref assignments was taken. Each of these averages is shown 
as an absolute value and as a percent of the FI average. The third experiment 
used the extended fragment analysis instead of the basic fragment analysis. 
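
The precision metric can be restated as a small Python sketch. The string encoding of names, the `expand` map from placeholders to the names they represent, and the function names are assumptions of this example; rounding to one decimal place is assumed because Table 4 reports percentages that way.

```python
import re


def count_simple_names(fixed_locations, expand):
    """Number of distinct root simple names possibly modified by one assignment.

    fixed_locations: names reported by the analysis (may include placeholders)
    expand:          placeholder -> list of fixed locations it represents
    """
    roots = set()
    for loc in fixed_locations:
        for actual in expand.get(loc, [loc]):
            # A modification of s.f or s[?] counts as a modification of s.
            roots.add(re.split(r"[.\[]", actual)[0])
    return len(roots)


def average(counts):
    return sum(counts) / len(counts)


def percent_of_fi(ffs_average, fi_average):
    return round(100.0 * ffs_average / fi_average, 1)


n = count_simple_names(["s.f", "s.g", "ph_0"], {"ph_0": ["a", "b[?]"]})
zip1_basic = percent_of_fi(12.98, 15.78)      # averages for zip.1 in Table 4
sc1_extended = percent_of_fi(1.00, 9.50)      # averages for sc.1 in Table 4
```

Applied to the averages in Table 4, `percent_of_fi` reproduces the reported percentages for these two rows.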

Overall, the results show that the precision of the extended analysis is very 
good. In particular, for seven fragments it produces averages very close to 1, 
which is the lower bound. The averages are bigger than 4 for only three frag- 
ments: all three take as input a pointer to an external data structure, and a 
large number of the through-deref assignments in the fragment are through this 
pointer. The pointer itself is not modified, and each modification through it 
resolves to the same number of simple names as in the FI solution. 

The performance of the basic analysis is less satisfactory. For five fragments, 
it achieves the same precision as the extended analysis. For the remaining frag- 
ments, in five cases the solution is close to the FI solution. The main reason is 
that precision gains from flow-sensitivity are lost at calls to procedures outside 
of the fragment. 

Table 5 shows the running times of the analyses in minutes and seconds. The 
last two columns give the used space in kilobytes. The results show that the cost 
of the fragment analyses is acceptable, in terms of both space and time. 

7 Related Work 

In Harrold and Rothermel’s separate pointer alias analysis [8], a software module 
is analyzed separately and later linked with other modules. The analysis is based 
Table 5. Analysis Time and Space 

              WholePrgm FI   Basic FFS   Extended FFS   Basic Space   Extended Space
Fragment      (min:sec)      (min:sec)   (min:sec)      (KB)          (KB)
zip.1            0:02          1:01         1:35           2544           3784
zip.2            0:02          0:18         0:18           2064           2064
sc.1             0:02          0:18         0:18           2504           2504
sc.2             0:02          0:25         0:33           2664           2824
larn.1           0:03          0:22         0:27           3760           6560
larn.2           0:03          0:20         0:28           3680           6556
espresso.1       0:07          0:52         1:58           5504          17400
espresso.2       0:07          0:56         3:07           5504          26904
tsl.1            0:08          0:33         0:43           5776           8672
tsl.2            0:08          1:15         1:25          12480          18648
moria.1          0:09          1:22         1:29           9504          10464
moria.2          0:09          1:39         1:37           7608          17440


on the whole-program analysis from [10]; it simulates the aliasing effects that are 
possible under all calling contexts for the module. Placeholder variables are used 
to represent sets of variables that are not explicitly referenced in the module, 
similarly to the placeholder variables in our fragment pointer alias analyses. 
Aliases are assigned extra tags describing the module calling context. 

There are several differences between our work and [8]. First, we have de- 
signed a general framework for fragment analysis and emphasized the importance 
of the theoretical requirements that ensure analysis safety and feasibility. Sec- 
ond, the intended application of our fragment pointer alias analyses is to improve 
the information about a part of the program after an inexpensive whole-program 
analysis; the application in [8] is separate analysis of single-entry modules. Fi- 
nally, [8] does not present empirical evaluation of the performance of the analysis; 
we believe that their approach may have scalability problems. 

Reference [11] presents an analysis that decomposes the program into regions 
in which several local problems are solved. Representative values are used for 
actual data-flow information that is external to the region; our placeholder vari- 
ables are similar to these representative values. Other similar mechanisms are the 
non-visible names from [10,7], extended parameters from [20], and unknown ini- 
tial values from [4]. We use an abstraction relation to capture the correspondence 
between the representative and the actual data-flow information; this is similar 
to the use of abstraction relations in the field of abstract interpretation [19,6]. 

The work in [3] also addresses the analysis of program fragments and uses 
the notion of representative data-flow information for external data-flow values. 
However, in [3] the specific fragments are libraries with no boundary calls, the 
analysis computes def-use associations in object-oriented languages with excep- 
tions, and there is no assumption of available whole-program information. 

Cardelli [2] considers separate type checking and compilation of program 
fragments. He proposes a theoretical framework in which program fragments are 
separately compiled in the context of some information about missing fragments, 
and later can be safely linked together. 

Model checking is a technique for verifying properties of finite-state sys- 
tems; the desired properties are specified using temporal-logic formulae. Modular 
model checking verifies properties of system modules, under some assumptions 
about the environment with which the module interacts [9]. These assumptions 
play an analogous role to that of the whole-program information in fragment 
analysis. Further discussion of the relationship between data-flow analysis and 
model checking is given in [14]. 

There is some similarity in the problem addressed by our work and that 
in [5], which presents an analysis of modular logic programs for which a com- 
positional semantics is defined. Each module can be analyzed using an abstract 
interpretation of the semantics. The analysis results for separate modules can 
be composed to yield results for the whole program, or alternatively, the results 
for one module can be used during the analysis of another module. 

8 Conclusions 

This paper is a first step in investigating the theory and practice of fragment 
data-flow analysis. It proposes fragment analysis as an alternative to traditional 
whole-program analysis. The theoretical issues involved in the design of safe 
and feasible flow-sensitive fragment analysis are discussed. One possible appli- 
cation of fragment analysis is to be used after a whole-program flow-insensitive 
analysis in order to improve the precision for an interesting portion of the pro- 
gram. This paper presents one such example, in which better information about 
modifications through pointer dereference is obtained by performing flow- and 
context-sensitive fragment pointer alias analysis. The design of two such anal- 
yses is described, and empirical results evaluating their cost and precision are 
presented. 

The empirical results show that the extended analysis presented in Sect. 5 
can achieve significant precision benefits at a practical cost. The performance 
of the basic analysis is less satisfactory, even though in about half of the cases 
it achieves the precision of the extended analysis. Clearly, in some cases the 
whole-program flow-insensitive solution is not a precise enough estimate of the 
effects of calls to external procedures. The extended analysis solves this problem 
for the fragments used in our experiments. We expect this approach to work 
well in cases when the extended fragment is not much bigger than the original 
fragment. One typical example would be a fragment with calls only to relatively 
small external procedures which provide simple services; in our experience, this 
is a common situation. However, in some cases the extended fragment may con- 
tain a prohibitively large part of the program; furthermore, the source code for 
some procedures may not be available. We are currently investigating different 
solutions to these problems. 

Abstract. Imagine some program and a number of changes. If none of these 
changes is applied (“yesterday”), the program works. If all changes are applied 
(“today”), the program does not work. Which change is responsible for the fail- 
ure? We present an efficient algorithm that determines the minimal set of failure- 
inducing changes. Our delta debugging prototype tracked down a single failure- 
inducing change from 178,000 changed GDB lines within a few hours. 



1 A True Story 

The GDB people have done it again. The new release 4.17 of the GNU debugger [6] 
brings several new features, languages, and platforms, but for some reason, it no longer 
integrates properly with my graphical front-end DDD [10]: the arguments specified 
within DDD are not passed to the debugged program. Something has changed within 
GDB such that it no longer works for me. Something? Between the 4.16 and 4.17 re- 
leases, no less than 178,000 lines have changed. How can I isolate the change that 
caused the failure and make GDB work again? 

The GDB example is an instance of the “worked yesterday, not today” problem: 
after applying a set of changes, the program no longer works as it should. In finding the 
cause of the regression, the differences between the old and the new configuration (that 
is, the changes applied) can provide a good starting point. We call this technique delta 
debugging — determining the causes for program behavior by looking at the differences 
(the deltas). 

Delta debugging works better the smaller the differences are. Unfortunately, 
even a single programmer can produce so many changes in a day that the differences 
are too large for a human to trace, let alone the differences between entire releases. In 
general, conventional debugging strategies lead to faster results. 

However, delta debugging becomes an alternative when the differences can be nar- 
rowed down automatically. Ness and Ngo [5] present a method used at Cray Research 
for compiler development. Their so-called regression containment is activated when the 
automated regression test fails. The method takes ordered changes from a configuration 
management archive and applies the changes, one after the other, to a configuration 
until its regression test fails. This narrows the search space from a set of changes to a 
single change, which can be isolated temporarily in order to continue development on 
a working configuration. 
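
The linear search performed by regression containment can be sketched as follows. This is a minimal illustration, not the implementation used at Cray Research; the function name `regression_containment`, the string outcomes, and the toy test are assumptions made for the example.

```python
from typing import Callable, Hashable, List, Optional, Sequence

PASS, FAIL = "PASS", "FAIL"


def regression_containment(
    ordered_changes: Sequence[Hashable],
    test: Callable[[List[Hashable]], str],
) -> Optional[Hashable]:
    """Apply the changes in order and return the first change whose
    application makes the regression test fail (None if it never fails)."""
    applied: List[Hashable] = []
    for change in ordered_changes:
        applied.append(change)
        if test(list(applied)) == FAIL:
            return change
    return None


# Toy scenario (hypothetical): eight ordered changes, change 7 breaks the test.
def toy_test(config):
    return FAIL if 7 in config else PASS


culprit = regression_containment(range(1, 9), toy_test)
```

The sketch needs one test per applied change, so its cost grows linearly with the position of the failure-inducing change, and it relies on the changes being ordered.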

O. Nierstrasz, M. Lemoine (Eds.): ESEC/FSE ’99, LNCS 1687, pp. 253-267, 1999. 

© Springer-Verlag Berlin Heidelberg 1999 

Regression containment is an effective delta debugging technique in some settings, 
including the one at Cray Research. But there are several scenarios where linear search 
is not sufficient: 

Interference. There may be not one single change responsible for a failure, but a com- 
bination of several changes: each individual change works fine on its own, but 
applying the entire set causes a failure. This frequently happens when merging the 
products of parallel development — and causes enormous debugging work. 
Inconsistency. In parallel development, there may be inconsistent configurations — 
combinations of changes that do not result in a testable program. Such configu- 
rations must be identified and handled properly. 

Granularity. A single logical change may affect several hundred or even thousand 
lines of code, but only a few lines may be responsible for the failure. Thus, one 
needs facilities to break changes into smaller chunks — a problem which becomes 
evident in the GDB example. 

In this paper, we present automated delta debugging techniques that generalize re- 
gression containment such that interference, inconsistencies, and granularity problems 
are dealt with in an effective and practical manner. In particular, our dd⁺ algorithm 

- detects arbitrary interferences of changes in linear time 

- detects individual failure-inducing changes in logarithmic time 

- handles inconsistencies effectively to support fine-granular changes. 

We begin with a few definitions required to present the basic dd algorithm. We 
show how its extension dd⁺ handles inconsistencies from fine-granular changes. Two 
real-life case studies using our WYNOT prototype¹ highlight the practical issues; in par- 
ticular, we reveal how the GDB failure was eventually resolved automatically. We close 
with discussions of future and related work, where we recommend delta debugging as 
standard operating procedure after any failing regression test. 

2 Configurations, Tests, and Failures 

We first discuss what we mean by configurations, tests, and failures. Our view of a 
configuration is the broadest possible: 

Definition 1 (Configuration). Let $C = \{\Delta_1, \Delta_2, \ldots, \Delta_n\}$ be the set of all possible 
changes $\Delta_i$. A change set $c \subseteq C$ is called a configuration. 

A configuration is constructed by applying changes to a baseline. 

Definition 2 (Baseline). An empty configuration $c = \emptyset$ is called a baseline. 

Note that we do not impose any constraints on how changes may be combined; in 
particular, we do not assume that changes are ordered. Thus, in the worst case, there are 
$2^n$ possible configurations for $n$ changes. 

To determine whether a failure occurs in a configuration, we assume a testing func- 
tion. According to the POSIX 1003.3 standard for testing frameworks [3], we distinguish 
three outcomes: 

¹ WYNOT = “Worked Yesterday, NOt Today” 




Yesterday, my Program Worked. Today, it Does Not. Why? 255 



- The test succeeds (PASS, written here as ✔) 

- The test has produced the failure it was intended to capture (FAIL, ✘) 

- The test produced indeterminate results (UNRESOLVED, ?).² 

Definition 3 (Test). The function $test : 2^C \to \{✘, ✔, ?\}$ determines for a configura- 
tion $c \subseteq C$ whether some given failure occurs (✘) or not (✔) or whether the test is 
unresolved (?). 

In practice, test would construct the configuration from the given changes, run a 
regression test on it and return the test outcome.³ 
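
A testing function of this shape might be sketched as follows. The names `Outcome`, `BuildError`, `make_test`, `build`, and `run_regression_test` are hypothetical, and mapping a failed construction to UNRESOLVED is an assumption of this sketch rather than something prescribed by the definition.

```python
from enum import Enum
from typing import Callable, FrozenSet


class Outcome(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    UNRESOLVED = "UNRESOLVED"


class BuildError(Exception):
    """Raised by the (hypothetical) build step when no program can be constructed."""


def make_test(build: Callable[[FrozenSet[int]], object],
              run_regression_test: Callable[[object], bool]):
    """Combine a build step and a regression test into a three-valued test."""
    def test(config) -> Outcome:
        try:
            program = build(frozenset(config))
        except BuildError:
            return Outcome.UNRESOLVED      # assumption: no program, no verdict
        return Outcome.PASS if run_regression_test(program) else Outcome.FAIL
    return test


# Toy stand-ins: change 2 cannot be built without change 3; change 8 breaks the test.
def toy_build(config):
    if 2 in config and 3 not in config:
        raise BuildError("change 2 requires change 3")
    return config


def toy_regression_test(program):
    return 8 not in program


toy_test = make_test(toy_build, toy_regression_test)
```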

Let us now model our initial scenario. We have some configuration “yesterday” that 
works fine and some configuration “today” that fails. For simplicity, we only consider 
the changes present “today”, but not “yesterday”. Thus, we model the “yesterday” con- 
figuration as baseline and the “today” configuration as set of all possible changes. 

Axiom 1 (Worked yesterday, not today). $test(\emptyset) = ✔$ (“yesterday”) and $test(C) = ✘$ 
(“today”) hold. 

What do we mean by changes that cause a failure? We are looking for a specific 
change set: those changes that make the program fail by including them in a configu- 
ration. We call such changes failure-inducing. 

Definition 4 (Failure-inducing change set). A change set c is failure-inducing if 

$$\forall c' \, (c \subseteq c' \subseteq C \rightarrow test(c') \neq ✔)$$ 

holds. 

The set of all changes C is failure-inducing by definition. However, we are more 
interested in finding the minimal failure-inducing subset of C, such that removing any 
of the changes will make the program work again: 

Definition 5 (Minimal failure-inducing set). A failure-inducing change set $B \subseteq C$ is 
minimal if 

$$\forall c \subset B \, (test(c) \neq ✘)$$ 

holds. 

And exactly this is our goal: For a configuration C, to find a minimal failure-inducing 
change set. 
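
Definitions 4 and 5 can be checked by brute force on a small example. The sketch below enumerates all subsets, so it is only meant to make the definitions concrete; the helper names and the toy test (failure occurs only when changes 2 and 3 are both applied) are assumptions.

```python
from itertools import combinations

PASS, FAIL, UNRESOLVED = "PASS", "FAIL", "UNRESOLVED"


def subsets(changes):
    """Yield every subset of `changes` as a frozenset (2**n subsets in total)."""
    items = sorted(changes)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_failure_inducing(c, all_changes, test):
    """Definition 4: no configuration that includes c passes the test."""
    c = frozenset(c)
    return all(test(sup) != PASS for sup in subsets(all_changes) if c <= sup)


def is_minimal(b, all_changes, test):
    """Definition 5: b is failure-inducing and no proper subset of b fails."""
    b = frozenset(b)
    return is_failure_inducing(b, all_changes, test) and all(
        test(c) != FAIL for c in subsets(b) if c < b
    )


# Toy test: the failure needs both change 2 and change 3.
ALL_CHANGES = frozenset({1, 2, 3, 4})


def toy_test(config):
    return FAIL if {2, 3} <= set(config) else PASS
```

In this toy setting, `{2, 3}` is failure-inducing and minimal, while `{1, 2, 3}` is failure-inducing but not minimal.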

3 Configuration Properties 

If every change combination produced arbitrary test results, we would have no choice 
but to test all $2^n$ configurations. In practice, this is almost never the case. Instead, con- 
figurations fulfill one or more specific properties that allow us to devise much more 
efficient search algorithms. 

² POSIX 1003.3 also lists untested and unsupported outcomes, which are of no relevance here. 
³ A single test case may take time. Recompilation and re-execution of a program may be a matter 
of several minutes, if not hours. This time can be considerably reduced by smart recompilation 
techniques [7] or caching derived objects [4]. 

The first useful property is monotony: once a change causes a failure, any configu- 
ration that includes this change fails as well. 

Definition 6 (Monotony). A configuration C is monotone if 

$$\forall c \subseteq C \, \bigl(test(c) = ✘ \rightarrow \forall c' \supseteq c \, (test(c') \neq ✔)\bigr) \qquad (1)$$ 

holds. 

Why is monotony so useful? Because once we know that a change set does not cause a 
failure, neither do any of its subsets: 

Corollary 1. Let C be a monotone configuration. Then, 

$$\forall c \subseteq C \, \bigl(test(c) = ✔ \rightarrow \forall c' \subseteq c \, (test(c') \neq ✘)\bigr) \qquad (2)$$ 

holds. 

Proof. By contradiction. For all configurations $c \subseteq C$ with $test(c) = ✔$, assume that 
$\exists c' \subseteq c \, (test(c') = ✘)$ holds. Then, Definition 6 implies $test(c) \neq ✔$, which is not the 
case. 

Another useful property is unambiguity: a failure is caused by only one change 
set (and not independently by two disjoint ones). This is mostly a matter of economy: 
once we have detected a failure-inducing change set, we do not want to search the 
complement for more failure-inducing change sets. 

Definition 7 (Unambiguity). A configuration C is unambiguous if 

$$\forall c_1, c_2 \subseteq C \, \bigl(test(c_1) = ✘ \land test(c_2) = ✘ \rightarrow test(c_1 \cap c_2) \neq ✔\bigr) \qquad (3)$$ 

holds. 

The third useful property is consistency: every subset of a configuration returns a 
determinate test result. This means that applying any combination of changes results in 
a testable configuration. 

Definition 8 (Consistency). A configuration C is consistent if 

$$\forall c \subseteq C \, (test(c) \neq \, ?)$$ 

holds. 

If a configuration does not fulfill a specific property, chances are that one of 
its subsets does. This is the basic idea of the divide-and-conquer algorithms 
presented below. 
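
The three properties can be made concrete with exhaustive checkers over a small change set. This is only a sketch for illustration: the function names and the four toy tests are assumptions, and the exhaustive enumeration is exactly what the properties are meant to let us avoid in practice.

```python
from itertools import combinations

PASS, FAIL, UNRESOLVED = "PASS", "FAIL", "UNRESOLVED"


def subsets(changes):
    """Yield every subset of `changes` as a frozenset."""
    items = sorted(changes)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_monotone(all_changes, test):
    """Definition 6: no superset of a failing configuration passes."""
    configs = list(subsets(all_changes))
    return all(test(sup) != PASS
               for c in configs if test(c) == FAIL
               for sup in configs if c <= sup)


def is_unambiguous(all_changes, test):
    """Definition 7: the intersection of two failing configurations never passes."""
    failing = [c for c in subsets(all_changes) if test(c) == FAIL]
    return all(test(a & b) != PASS for a in failing for b in failing)


def is_consistent(all_changes, test):
    """Definition 8: every configuration yields a determinate outcome."""
    return all(test(c) != UNRESOLVED for c in subsets(all_changes))


CHANGES = frozenset({1, 2, 3})


def single_cause(c):        # change 3 alone causes the failure
    return FAIL if 3 in c else PASS


def two_causes(c):          # changes 1 and 2 each cause the failure independently
    return FAIL if (1 in c or 2 in c) else PASS


def undoing_change(c):      # change 2 "undoes" the failure caused by change 1
    return FAIL if (1 in c and 2 not in c) else PASS


def needs_partner(c):       # change 2 cannot be applied without change 3
    if 2 in c and 3 not in c:
        return UNRESOLVED
    return FAIL if 1 in c else PASS
```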

4 Finding Failure-Inducing Changes 

For presentation purposes, we begin with the simplest case: a configuration c that is 
monotone, unambiguous, and consistent. (These constraints will be relaxed bit by bit in 
the following sections.) For such a configuration, we can design an efficient algorithm 
based on binary search to find a minimal set of failure-inducing changes. If c contains 
only one change, this change is failure-inducing by definition. Otherwise, we partition c 
into two subsets $c_1$ and $c_2$ and test each of them. This gives us three possible outcomes: 

Found in $c_1$. The test of $c_1$ fails: $c_1$ contains a failure-inducing change. 

Found in $c_2$. The test of $c_2$ fails: $c_2$ contains a failure-inducing change. 

Interference. Both tests pass. Since we know that testing $c = c_1 \cup c_2$ fails, the failure 
must be induced by the combination of some change set in $c_1$ and some change set 
in $c_2$. 

In the first two cases, we can simply continue the search in the failing subset, as 
illustrated in Table 1. Each line of the diagram shows a configuration. A number i 
stands for an included change $\Delta_i$; a dot stands for an excluded change. Change 7 is the 
one that causes the failure, and it is found in just a few steps. 

Step     c_i   Configuration      test
1        c_1   1 2 3 4 . . . .    ✔
2        c_2   . . . . 5 6 7 8    ✘
3        c_1   . . . . 5 6 . .    ✔
4        c_2   . . . . . . 7 8    ✘
5        c_1   . . . . . . 7 .    ✘    7 is found
Result         . . . . . . 7 .

Table 1. Searching a single failure-inducing change 



But what happens in case of interference? In this case, we must search in both 
halves, with all changes in the other half remaining applied, respectively. This variant 
is illustrated in Table 2. The failure occurs only if the two changes 3 and 6 are applied 
together. Step 3 illustrates how changes 5-7 remain applied while searching through 1- 
4; in step 6, changes 1-4 remain applied while searching in 5-7.⁴ 

[Table 2: only fragments of the body survive in the source: the step 7 row (c_1: 1 2 3 4 5 . . .), the result (changes 3 and 6), and the annotations “3 is found” and “6 is found”.]

Table 2. Searching two failure-inducing changes 

We can now formalize the search algorithm. The function dd(c) returns all failure- 
inducing changes in c; we use a set r to denote the changes that remain applied. 

⁴ Delta debugging is not restricted to programs alone. On this LaTeX document, 14 iterations of 
manual delta debugging had to be applied until Table 2 eventually re-appeared on the same 
page as its reference. 

Algorithm 1 (Automated delta debugging). The automated delta debugging algo- 
rithm dd(c) is 

$$dd(c) = dd_2(c, \emptyset) \quad \text{where}$$

$$dd_2(c, r) = \text{let } c_1, c_2 \subseteq c \text{ with } c_1 \cup c_2 = c,\ c_1 \cap c_2 = \emptyset,\ |c_1| \approx |c_2| \approx |c|/2 \text{ in}$$

$$\begin{cases}
c & \text{if } |c| = 1 \text{ (“found”)} \\
dd_2(c_1, r) & \text{else if } test(c_1 \cup r) = ✘ \text{ (“in } c_1\text{”)} \\
dd_2(c_2, r) & \text{else if } test(c_2 \cup r) = ✘ \text{ (“in } c_2\text{”)} \\
dd_2(c_1, c_2 \cup r) \cup dd_2(c_2, c_1 \cup r) & \text{otherwise (“interference”)}
\end{cases}$$

The recursion invariant (and thus precondition) for $dd_2$ is $test(r) = ✔ \land test(c \cup r) = ✘$. 



The basic properties of dd are discussed and proven in [9]. In particular, we show 
that dd(c) returns a minimal set of failure-inducing changes in c if c is monotone, un- 
ambiguous, and consistent. 

Since dd is a divide-and-conquer algorithm with constant time requirement at each 
invocation, dd's time complexity is at worst linear. This is illustrated in Table 3, where 
only the combination of all changes is failure-inducing, and where dd requires less than 
two tests per change to find them. If there is only one failure-inducing change to be 
found, dd even has logarithmic complexity, as illustrated in Table 1. 

[Table 3: the individual steps did not survive in the source. Result: 1 2 3 4 5 6 7 8, with each of the eight changes annotated as found.]

Table 3. Searching eight failure-inducing changes 
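
Algorithm 1 translates almost directly into code. The following is a minimal sketch under the stated assumptions (a monotone, unambiguous, consistent configuration whose full change set fails); the names `dd` and `_dd2`, the string outcomes, and the choice to split a list at its midpoint are ours.

```python
PASS, FAIL = "PASS", "FAIL"


def dd(changes, test):
    """Return a set of failure-inducing changes, following Algorithm 1.

    `test` takes a frozenset of changes and returns PASS or FAIL.
    """
    return _dd2(list(changes), frozenset(), test)


def _dd2(c, r, test):
    if len(c) == 1:
        return set(c)                                # "found"
    mid = len(c) // 2
    c1, c2 = c[:mid], c[mid:]                        # two halves of similar size
    if test(frozenset(c1) | r) == FAIL:
        return _dd2(c1, r, test)                     # "in c1"
    if test(frozenset(c2) | r) == FAIL:
        return _dd2(c2, r, test)                     # "in c2"
    # "interference": search each half while the other half remains applied
    return (_dd2(c1, frozenset(c2) | r, test)
            | _dd2(c2, frozenset(c1) | r, test))


def fails_if_all_present(required):
    """Build a toy test that fails only if every change in `required` is applied."""
    required = frozenset(required)
    return lambda config: FAIL if required <= config else PASS
```

With eight changes, `dd(range(1, 9), fails_if_all_present({7}))` isolates change 7, `fails_if_all_present({3, 6})` yields both interfering changes, and requiring all eight changes returns the full set.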

Let us now recall the properties dd requires from configurations: monotony, unam- 
biguity, and consistency. How does dd behave when c is not monotone or when it is 
ambiguous? In case of interference, dd still returns a failure-inducing change set, al- 
though it may not be minimal. Perhaps surprisingly, a single failure-inducing change 
(and hence a minimal set) is found even for non-monotone or ambiguous configura- 
tions: 

- If a configuration is ambiguous, multiple failure-inducing changes may occur; dd 
returns one of them. (After undoing this change set, re-run dd to find the next one.) 

- If a configuration is not monotone, then we can devise “undoing” changes that, 
when applied to a previously failing configuration c, cause c to pass the test again. 
But still, today’s configuration is failing; hence, there must be another failure- 
inducing change that is not undone and that can be found by dd. 

5 Handling Inconsistency 

The most important practical problem in delta debugging is inconsistent configurations. 
When combining changes in an arbitrary way, such as done by dd, it is likely that several 
resulting configurations are inconsistent: the outcome of the test cannot be determined. 
Here are some of the reasons why this may happen: 

Integration failure. A change cannot be applied. It may require earlier changes that 
are not included in the configuration. It may also be in conflict with another change 
and a third conflict-resolving change is missing. 

Construction failure. Although all changes can be applied, the resulting program has 
syntactic or semantic errors, such that construction fails. 

Execution failure. The program does not execute correctly; the test outcome is unre- 
solved. 

Since it is improbable that all configurations tested by dd have been checked for 
inconsistencies beforehand, tests may well turn out unresolved during a dd run. Thus, 
dd must be extended to deal with inconsistent configurations. 

Let us begin with the worst case: after splitting up c into subsets, all tests are 
unresolved; ignorance is complete. How can we increase our chances of getting a resolved 
test? We know two configurations that are consistent: $\emptyset$ (“yesterday”) and C (“today”). 
By applying fewer changes to “yesterday’s” configuration, we increase the chances that 
the resulting configuration is consistent, since the difference to “yesterday” is smaller. Like- 
wise, we can remove fewer changes from “today’s” configuration and decrease the differ- 
ence to “today”. 

In order to apply fewer changes, we can partition c into a larger number of subsets. 
The more subsets we have, the smaller they are, and the bigger are our chances to get 
a consistent configuration, until each subset contains only one change, which gives us 
the best chance to get a consistent configuration. The disadvantage, of course, is that 
more subsets means more testing. 

To extend the basic dd algorithm to work on an arbitrary number n of subsets 
$c_1, \ldots, c_n$, we must distinguish the following cases: 

Found. If testing any $c_i$ fails, then $c_i$ contains a failure-inducing subset. This is just as 
in dd. 

Interference. If testing any $c_i$ passes and its complement $\bar{c}_i$ passes as well, then the 
change sets $c_i$ and $\bar{c}_i$ form an interference, just as in dd. 

Preference. If testing any $c_i$ is unresolved, and testing $\bar{c}_i$ passes, then $c_i$ contains a 
failure-inducing subset and is preferred. In the following test cases, $\bar{c}_i$ must remain 
applied to promote consistency. 

As a preference example, consider Table 4. In Step 1, testing $c_1$ turns out unre- 
solved, but its complement $\bar{c}_1 = c_2$ passes the test in Step 2. Consequently, $c_2$ can- 
not contain a bug-inducing change set, but $c_1$ can, possibly in interference with $c_2$, 
which is why $c_2$ remains applied in the following test cases. 

Step   c_i   Configuration      test
1      c_1   1 2 3 4 . . . .    ?    Testing c_1, c_2
2      c_2   . . . . 5 6 7 8    ✔    ⇒ Prefer c_1
3      c_1   1 2 . . 5 6 7 8

Table 4. Preference 

Try again. In all other cases, we repeat the process with 2n subsets, resulting in 
twice as many tests, but increased chances for consistency. 

As a “try again” example, consider Table 5. Change 8 is failure-inducing, and 
changes 2, 3 and 7 imply each other; that is, they can only be applied as a whole. 
Note how the test is repeated first with n = 2, then with n = 4 subsets. 

Step   c_i           Configuration      test
1      c_1 = c̄_2     1 2 3 4 . . . .    ?    Testing c_1, c_2
2      c_2 = c̄_1     . . . . 5 6 7 8    ?    ⇒ Try again
3      c_1           1 2 . . . . . .    ?    Testing c_1, ..., c_4
4      c_2           . . 3 4 . . . .    ?
5      c_3           . . . . 5 6 . .    ✔
6      c_4           . . . . . . 7 8    ?
7      c̄_1           . . 3 4 5 6 7 8    ?    Testing complements
8      c̄_2           1 2 . . 5 6 7 8    ?
9      c̄_3           1 2 3 4 . . 7 8    ✘
10     c̄_4           1 2 3 4 5 6 . .    ?    ⇒ Try again

Table 5. Searching failure-inducing changes with inconsistencies 

In each new run, we can do a little optimizing: all $c_i$ that passed the test can be ex- 
cluded from c, since they cannot be failure-inducing. Likewise, all $c_i$ whose com- 
plements $\bar{c}_i$ failed the test can remain applied in following tests. In our example, 
this applies to changes 5 and 6, such that we can continue with n = 6 subsets. 
After testing each change individually, we finally find the failure-inducing change, 
as shown in Table 6. 

[Table 6: the individual steps did not survive in the source. Result: change 8 is found.]

Table 6. Searching failure-inducing changes with inconsistencies (continued) 

Note that at this stage, changes 1, 4, 5 and 6 have already been identified as not 
failure-inducing, since their respective tests passed. If the failure had not been in- 
duced by change 8, but by 2, 3, or 7, we would have found it simply by excluding 
all other changes. 

To summarize, here is the formal definition of the extended dd⁺ algorithm: 

Algorithm 2 (Delta debugging with unresolved test cases). 

The extended delta debugging algorithm $dd^+(c)$ is 

$$dd^+(c) = dd_3(c, \emptyset, 2) \quad \text{where}$$
$$dd_3(c, r, n) =$$

let $c_1, \ldots, c_n \subseteq c$ such that $\bigcup c_i = c$, all $c_i$ are pairwise disjoint, 
and $\forall c_i \, (|c_i| \approx |c|/n)$; 

let $\bar{c}_i = c - (c_i \cup r)$, $t_i = test(c_i \cup r)$, $\bar{t}_i = test(\bar{c}_i \cup r)$, 
$c' = c \cap \bigcap \{\bar{c}_i \mid t_i = ✔\}$, $r' = r \cup \bigcup \{c_i \mid \bar{t}_i = ✘\}$, $n' = \min(|c'|, 2n)$, 
$d_i = dd_3(c_i, \bar{c}_i \cup r, 2)$, and $\bar{d}_i = dd_3(\bar{c}_i, c_i \cup r, 2)$ in 

$$\begin{cases}
c & \text{if } |c| = 1 \text{ (“found”)} \\
dd_3(c_i, r, 2) & \text{else if } t_i = ✘ \text{ for some } i \text{ (“found in } c_i\text{”)} \\
d_i \cup \bar{d}_i & \text{else if } t_i = ✔ \land \bar{t}_i = ✔ \text{ for some } i \text{ (“interference”)} \\
d_i & \text{else if } t_i = \,? \land \bar{t}_i = ✔ \text{ for some } i \text{ (“preference”)} \\
dd_3(c', r', n') & \text{else if } n < |c| \text{ (“try again”)} \\
c' & \text{otherwise (“nothing left”)}
\end{cases}$$

The recursion invariant for $dd_3$ is $test(r) \neq ✘ \land test(c \cup r) \neq ✔ \land n \leq |c|$. 

Apart from its extensions for unresolved test cases, the $dd_3$ function is identical to $dd_2$ 
with an initial value of n = 2. Like dd, dd⁺ has linear time complexity (but requires 
twice as many tests). 

Eventually, dd⁺ finds a minimal set of failure-inducing changes, provided that they 
are safe; that is, they can either be applied to the baseline or removed from today’s 
configuration without causing an inconsistency. If this condition is not met, the set 
returned by dd⁺ may not be minimal, depending on the nature of inconsistencies en- 
countered. But at least, all changes that are safe and not failure-inducing are guaranteed 
to be excluded.⁵ 

6 Avoiding Inconsistency 

In practice, we can significantly reduce the risk of inconsistencies by relying on spe- 
cific knowledge about the nature of the changes. There are two ways to influence the 
dd⁺ algorithm: 

⁵ True minimality can only be achieved by testing all $2^n$ configurations. Consider a hypothetical 
set of changes where only three configurations are consistent: yesterday’s, today’s, and one 
arbitrary configuration. Only by trying all combinations can we find this third configuration; 
inconsistency has no specific properties like monotony that allow for more effective methods. 

Grouping Related Changes. Reconsider the changes 2, 3, and 7 of Table 5. If we 
had some indication that the changes imply each other, we could keep them in a 
common subset as long as possible, thereby reducing the number of unresolved test 
cases. To determine whether changes are related, one can use 

- process criteria, such as common change dates or sources, 

- location criteria, such as the affected file or directory, 

- lexical criteria, such as common referencing of identifiers, 

- syntactic criteria, such as common syntactic entities (functions, modules) af- 
fected by the change, 

- semantic criteria, such as common program statements affected by the changed 
control flow or changed data flow. 

For instance, it may prove useful to group changes together that all affect a specific 
function (syntactic criteria) or that occurred at a common date (process criteria). 

Predicting Test Outcomes. If we have evidence that specific configurations will be 
inconsistent, we can predict their test outcomes as unresolved instead of carrying 
out the test. In Table 5, if we knew about the implications, then only 5 out of 16 
tests would actually be carried out. 

Predicting test outcomes is especially useful if we can impose an ordering on the 
changes. Consider Table 7, where each change $\Delta_i$ implies all “earlier” changes 
$\Delta_1, \ldots, \Delta_{i-1}$. Given this knowledge, we can predict the test outcomes of steps 
2 and 4; only three tests would actually be carried out to find the failure-inducing 
change. 

[Table 7: only fragments of the body survive in the source: the configurations 1 2 3 4 . . . . and 1 2 3 4 5 6 . ., two outcomes marked “predicted outcome”, and the result . . . . . . 7 . (7 is found).]

Table 7. Searching failure-inducing changes in a total order 

We see that when changes can be ordered, predicting test outcomes makes dd⁺ act 
like a binary search algorithm. 
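
Prediction under a total order can be sketched as a wrapper around the real test: a configuration is only testable if it is a prefix of the ordered changes, and every other configuration is reported as unresolved without running anything. The wrapper name `with_order_prediction` and the toy test are assumptions made for this illustration.

```python
PASS, FAIL, UNRESOLVED = "PASS", "FAIL", "UNRESOLVED"


def with_order_prediction(test, ordered_changes):
    """Wrap `test` so that configurations violating the total order are
    predicted as UNRESOLVED instead of being built and tested."""
    rank = {change: i for i, change in enumerate(ordered_changes)}

    def predicted_test(config):
        size = len(config)
        # A configuration respects the order iff it consists of exactly the
        # first `size` changes, i.e. every change in it has rank < size.
        if any(rank[change] >= size for change in config):
            return UNRESOLVED
        return test(config)

    return predicted_test


real_runs = []


def toy_test(config):
    """Toy test in which change 7 induces the failure; records each real run."""
    real_runs.append(config)
    return FAIL if 7 in config else PASS


predicted = with_order_prediction(toy_test, range(1, 9))
```

Only prefixes such as `{1, 2, 3, 4}` reach the real test; a configuration such as `{5, 6, 7, 8}` is answered immediately.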

Both grouping and predicting will be used in two case studies, presented below. 

7 First Case Study: DDD 3.1.2 Dumps Core 

DDD 3.1.2, released in December, 1998, exhibited a nasty behavioral change: When 
invoked with the name of a non-existing file, DDD 3.1.2 dumped core, while its pre- 
decessor DDD 3.1.1 simply gave an error message. We wanted to find the cause of this 
failure by using WYNOT. 

The DDD configuration management archive lists 116 logical changes between the 
3.1.1 and 3.1.2 releases. These changes were split into 344 textual changes to the DDD 
source. 

[Two plots, each titled “Delta Debugging Log”: (a) with random clustering, (b) with date clustering]

Table 8. Searching a failure-inducing change in DDD 



In a first attempt, we ignored any knowledge about the nature or ordering of the 
changes; changes were ordered and partitioned at random. Table 8(a) shows the re- 
sulting WYNOT run. After test #4, WYNOT has reduced the number of 
remaining changes to 172. The next tests turn out unresolved, so WYNOT gradually 
increases the number of subsets; at test #16, WYNOT starts using 8 subsets, each con- 
taining 16 changes. At test #23, the 7th subset fails, and only its 16 changes remain. 
Eventually, test #31 determines the failure-inducing change. 

We then wanted to know whether knowledge from the configuration management 
archive would improve performance. We used the following process criteria: 

1. Changes were grouped according to the date they were applied. 

2. Each change implied all earlier changes. If a configuration would not satisfy this 
requirement, its test outcome would be predicted as unresolved. 

As shown in Table 8(b), this resulted in a binary search with very few inconsisten- 
cies. After only 12 test runs and 58 minutes⁶, the failure-inducing change was found: 

diff -r1.30 -r1.30.4.1 ddd/gdbinit.C 
295,296c296 
< string classpath = 
<     getenv("CLASSPATH") != 0 ? getenv("CLASSPATH") : "."; 
--- 
> string classpath = source_view->class_path(); 

When called with an argument that is not a file name, DDD 3.1.1 checks whether 
it is a Java class; so DDD consults its environment for the class lookup path. As an 
“improvement”, DDD 3.1.2 uses a dedicated method for this purpose. Unfortunately, 
the source_view pointer used is initialized only later, resulting in a core dump. This 
problem has been fixed in the current DDD release. 



8 Second Case Study: GDB 4.17 Does Not Integrate 

Let us now face greater challenges. As motivated in Section 1, we wanted to track 
down a failure in 178,000 changed GDB lines. In contrast to the DDD setting from 
Section 7, we had no configuration management archive from which to take ordered 
logical changes. 

⁶ All times were measured on a Linux PC with a 200 MHz AMD K6 processor. 

The 178,000 lines were automatically grouped into 8721 textual changes in the 
GDB source, with any two textual changes separated by at least two unchanged lines 
(“context”). The average reconstruction time after applying a change turned out to be 
370 seconds. This means that we could run 233 tests in 24 hours or 8721 changes 
individually in 37 days. 
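
The two figures follow directly from the average reconstruction time; a short calculation makes the arithmetic explicit (the variable names are ours).

```python
SECONDS_PER_TEST = 370            # average reconstruction time per test
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds
TEXTUAL_CHANGES = 8721

# Whole tests that fit into 24 hours.
tests_per_day = SECONDS_PER_DAY // SECONDS_PER_TEST

# Days needed to test each of the 8721 changes individually, one after the other.
days_for_all_changes = TEXTUAL_CHANGES * SECONDS_PER_TEST / SECONDS_PER_DAY
```

Here `tests_per_day` evaluates to 233 and `days_for_all_changes` to roughly 37.3.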

Again, we first ignored any knowledge about the nature of the changes. The result 
of this WYNOT run is shown in Table 9(a). Most of the first 457 tests turn out unre- 
solved, so WYNOT gradually increases the number of subsets, reducing the number of 
remaining changes. At test #458, each subset contains only 36 changes, and it is one of 
these subsets that turns out to be failure-inducing. After this breakthrough, the remain- 
ing 12 tests determine a single failure-inducing change. 

Running the 470 tests still took 48 hours. Once more, we decided to improve perfor- 
mance. Since process criteria were not available, we used location criteria and lexical 
criteria to group changes: 

1. At top-level, changes were grouped according to directories. This was motivated 
by the observation that several GDB directories contain a separate library whose 
interface remains more or less consistent across changes. 

2. Within one directory, changes were grouped according to common files. The idea 
was to identify compilation units whose interface was consistent with both “yester- 
day’s” and “today’s” version. 

3. Within a file, changes were grouped according to common usage of identifiers. 
This way, we could keep changes together that operated on common variables or 
functions. 

Finally, we added a failure resolution loop: After a failing construction, WYNOT 
scans the error messages for identifiers, adds all changes that reference these identifiers 
and tries again. This is repeated until construction is possible, or until there are no more 
changes to add. 
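
The failure resolution loop can be sketched as follows. This is a minimal illustration and not the WYNOT implementation: the `build` and `identifiers_of` hooks are hypothetical, and treating every word in the error text as a candidate identifier is a simplifying assumption.

```python
import re

def resolve_construction(subset, all_changes, build, identifiers_of):
    """Grow `subset` until construction succeeds or nothing can be added.

    build(changes) -> (ok, error_text)       hypothetical build hook
    identifiers_of(change) -> set of names   hypothetical lexical index
    """
    current = set(subset)
    while True:
        ok, errors = build(current)
        if ok:
            return current, True
        # Scan the error messages for identifier-like tokens.
        mentioned = set(re.findall(r"[A-Za-z_]\w*", errors))
        # Add every change not yet applied that references one of them.
        extra = {c for c in all_changes
                 if c not in current and identifiers_of(c) & mentioned}
        if not extra:
            return current, False   # no more changes to add
        current |= extra

# Toy usage: change c1 only builds when c2 is applied as well.
ids = {"c1": {"foo"}, "c2": {"foo", "bar"}, "c3": {"baz"}}

def toy_build(changes):
    if "c1" in changes and "c2" not in changes:
        return False, "error: undefined reference to foo"
    return True, ""

result, ok = resolve_construction({"c1"}, ids.keys(), toy_build, ids.__getitem__)
print(sorted(result), ok)   # ['c1', 'c2'] True
```

The loop terminates because each failing round either adds at least one change or gives up.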

The result of this WYNOT run is shown in Table 9(b). At first, WYNOT split the 
changes according to their directories. After 9 tests with various directory combinations, 
WYNOT has a breakthrough: the failure-inducing change is to be found in one specific 
directory. Only 2547 changes are left. 

A long period without significant success follows; WYNOT partitions changes into 
an increasing number of subsets. The second breakthrough occurs at test #280, where 
each subset contains only 18 changes and where WYNOT narrows down the number 
of changes to a subset of two files only. The end comes at test #289, after a total of 
20 hours. We see that the lexical criteria reduced the number of tests by 38% (from
470 to 289) and the total running time by more than 50% (from 48 to 20 hours).

In both cases, WYNOT broke the 178,000 lines down to the same one-line
change that, when applied, causes DDD to malfunction:

    diff -r gdb-4.16/gdb/infcmd.c gdb-4.17/gdb/infcmd.c
    1239c1278
    < "Set arguments to give program being debugged when it is started.\n\
    ---
    > "Set argument list to give program being debugged when it is started.\n\

Table 9. Searching for a failure-inducing change in GDB

This change in a string constant from arguments to argument list was 
responsible for GDB 4.17 not interoperating with DDD. Given the command show 
args, GDB 4.16 replies 

Arguments to give program being debugged when it is started is "a b c" 

but GDB 4.17 issues a slightly different (and grammatically correct) text: 

Argument list to give program being debugged when it is started is "a b c" 

which could not be parsed by DDD! To solve the problem here and now, we simply
reversed the GDB change; eventually, DDD was upgraded to make it work with the new 
GDB version, too. 



9 Related Work 

There is only one other work on automated delta debugging we have found: the paper on
regression containment by Ness and Ngo [5], presented earlier in this paper.^ Ness and Ngo use
simple linear and binary search to identify a single failure-inducing change. Their goal,
however, lies not in debugging, but in isolating (i.e. removing) the failure-inducing
change such that development of the product is not delayed by resolving the failure.
The existence of a configuration management archive with totally ordered changes is
assumed; issues like interference, inconsistencies, granularity, or non-monotony are
neither handled nor discussed.

^ Ness and Ngo cite no related work, so we assume they found none either.

Consequently, the failure-inducing change in GDB from Section 8 would not be 
found at all since there is no configuration management archive from which to take 
logical changes; in the DDD setting from Section 7, the logical change would be found, 
but could not have been broken down into this small chunk. 

10 Conclusions and Future Work 

Delta debugging resolves regression causes automatically and effectively. If configu- 
ration information is available, delta debugging is easy; otherwise, there are effective 
techniques that indicate change dependencies. Although resource-intensive, delta de- 
bugging requires no manual intervention and thus saves valuable developer time. 

We recommend that delta debugging be an integrated part of regression testing; 
each time a regression test fails, a delta debugging program should be started to resolve 
the regression cause. The algorithms presented in this paper provide successful delta 
debugging solutions that handle difficult details such as interferences, inconsistencies, 
and granularity. 

Our future work will concentrate on avoiding inconsistencies by exploiting domain 
knowledge. Most simple configuration management archives enforce that each change 
implies all earlier changes; we want to use full-fledged constraint systems instead [11]. 
Another issue is to use syntactic criteria in order to group changes by affected
functions and modules. The most complicated, but most promising, approach uses
semantic criteria: Given a change and a program, we can determine a slice of the program where
program execution may be altered by applying the change. Such slices have been suc- 
cessfully used for semantics-preserving program integration [2] as well as for determin- 
ing whether a regression test is required after applying a specific change [1]. The basic 
idea is to determine two program dependency graphs (PDGs) — one for “yesterday’s” 
and one for “today’s” configuration. Then, for each change c and each PDG, we deter- 
mine the forward slice from the nodes affected by c. We can then group changes by the 
common nodes contained in their respective slices; two changes with disjoint slices end 
up in different partitions. 
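
A minimal sketch of this grouping step follows, assuming the forward slices have already been computed and are given as sets of PDG node identifiers (the function name and data layout are ours). It groups changes by connected components of the "slices overlap" relation. Overlap is not transitive, so two changes with disjoint slices can still share a group when a third change overlaps both; they end up in different partitions only when no such chain exists.

```python
def group_by_slices(slices):
    """slices: dict mapping a change id to the set of PDG nodes in its slice."""
    groups = []   # list of (changes, nodes) pairs with pairwise disjoint nodes
    for change, nodes in slices.items():
        merged_changes, merged_nodes = {change}, set(nodes)
        remaining = []
        for g_changes, g_nodes in groups:
            if g_nodes & merged_nodes:
                # Slices share a node: merge the groups.
                merged_changes |= g_changes
                merged_nodes |= g_nodes
            else:
                remaining.append((g_changes, g_nodes))
        remaining.append((merged_changes, merged_nodes))
        groups = remaining
    return [sorted(changes) for changes, _ in groups]

# Toy usage: c1 and c2 share node 2; c3 touches an unrelated node.
example = {"c1": {1, 2}, "c2": {2, 3}, "c3": {7}}
print(group_by_slices(example))   # [['c1', 'c2'], ['c3']]
```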

Besides consistency issues, we want to use code coverage tools in order to exclude 
changes to code that is never executed. The intertwining of changes to construction 
commands, system models, and actual source code must be handled, possibly by multi- 
version system models [8]. Further case studies will validate the effectiveness of all 
these measures, as of delta debugging in general. 

Acknowledgments. Carsten Schulz contributed significantly to the current WYNOT 
implementation. The first delta debugging prototype was implemented by Ulrike Heuer. 
Jens Krinke, Christian Lindig, Kerstin Reese, Torsten Robschink, Gregor Snelting, and 
Paul Strooper provided valuable comments on earlier revisions of this paper. 

Further information on delta debugging, including the full WYNOT implementation, 
is available at http://www.fmi.uni-passau.de/st/wynot/.

Abstract. We present a novel approach to avoid the debugging of optimized
code through comparison checking. In the technique presented,
both the unoptimized and optimized versions of an application program 
are executed, and computed values are compared to ensure the behav- 
iors of the two versions are the same under the given input. If the values 
are different, the comparison checker displays where in the application 
program the differences occurred and what optimizations were involved. 
The user can utilize this information and a conventional debugger to 
determine if an error is in the unoptimized code. If the error is in the 
optimized code, the user can turn off those offending optimizations and 
leave the other optimizations in place. We implemented our comparison 
checking scheme, which executes the unoptimized and optimized versions 
of C programs, and ran experiments that demonstrate the approach is 
effective and practical. 



1 Introduction 

Although optimizations are important in improving the performance of pro- 
grams, an application programmer typically compiles a program during the de- 
velopment phase with the optimizer turned off. After the program is tested and 
apparently free of bugs, it is ready for production. The user then compiles the 
program with the optimizer turned on to take advantage of the performance 
improvement offered by optimizations. However, when the application is opti- 
mized, its semantic behavior may not be the same as the unoptimized program; 
we consider the semantic behaviors of an unoptimized program and its optimized 
program version to be the same if all corresponding statements executed in both 
programs compute the same values under given inputs. In this situation, the 
programmer is likely to assume that errors in the optimizer are responsible for 
the change in semantic behavior and, lacking any more information, turn all the 
optimizations off. 

Differences in semantic behaviors between unoptimized and optimized program
versions are caused by either (1) the application of an unsafe optimization,
(2) an error in the optimizer, or (3) an error in the source program that is 
exposed by the optimization. For instance, reordered operations under certain 
conditions can cause overflow or underflow or produce different floating point 
values. The optimized program may crash (e.g., division by zero) because of 
code reordering. The application of an optimization may assume that the source 
code being transformed follows a programming standard (e.g., ANSI standard),
and if the code does not, then an error can be introduced by the optimization. 
The optimizer itself may also contain an error in the implementation of a par- 
ticular optimization. And lastly, the execution of the optimized program may 
uncover an error that was not detected in the unoptimized program (e.g., unini- 
tialized variable, stray pointer, array index out of bounds). Thus, for a number 
of reasons, a program may execute correctly when compiled with the optimizer 
turned off but fail when the optimizer is turned on. 

Determining the cause of errors in an optimized program is hampered by 
the limitations of current techniques for debugging optimized code. Source level 
debugging techniques for optimized code have been developed but they either 
constrain debugging features, limit optimizations, modify the code, or burden 
the user with understanding how optimizations affect the source level program. 
For example, to locate an error in the optimized program, the user must step 
through the execution of the optimized program and examine values of vari- 
ables to find the statement that computes an incorrect value. Unfortunately, 
if the user wishes to observe the value of a variable at some program point, 
the debugger may not be able to report this value because the value has not 
been computed yet or was overwritten. Techniques to recover the values for 
reporting purposes work in limited situations and for a limited set of optimiza- 
tions [13,10,25,19,7,12,14,4,18,5,6,23,3,24,8]. 

In this paper, we present comparison checking, a novel approach that avoids 
debugging of optimized code. The comparison checking scheme is illustrated in 
Figure 1. The user first develops, tests, and debugs the unoptimized program 
using a conventional debugger and once the program appears free of bugs, the 
program is optimized. The comparison checker orchestrates the executions of 
both the unoptimized and optimized versions of a source program under a num- 
ber of inputs and compares the semantic behaviors of both program versions 
for each of the inputs; comparisons are performed on values computed by corresponding
executed statements from both program versions. If the semantic
behaviors are the same and correct, the optimized program can be run with 
high confidence. On the other hand, if the semantic behaviors differ, the checker 
displays the statements responsible for the differences and optimizations applied 
to these statements. The user can use this information and a conventional de- 
bugger to determine if the error is in the unoptimized or optimized code. If the 
user finds the problem to be in the unoptimized version, modifications are per- 
formed and the checking is repeated. If the error is in the optimized code, the 
user can turn off those offending optimizations in the affected parts of the source
program. The difference would be eliminated without sacrificing the benefits of
correctly applied optimizations.

Fig. 1. The Comparison Checking System

Our system can locate the earliest point where both programs differ in their 
semantic behavior. That is, the checker detects the earliest point during execu- 
tion time when corresponding statement instances should but do not compute 
the same values. For example, if a difference is due to an uninitialized variable, 
our scheme detects the first incorrect use of the value of this variable. If a dif- 
ference is due to the optimizer, this scheme helps locate the statement that was 
incorrectly optimized. In fact, this scheme proved very useful in debugging the
optimizer that we implemented for this work.

The merits of a comparison checking system are as follows. 

— The user debugs the unoptimized version of the program, and is therefore 
not burdened with understanding the optimized code. 

— The user has greater confidence in the correctness of the optimized program. 

— When a comparison fails, we report the earliest place where the failure oc- 
curred and the optimizations that involved the statement. Information about 
where an optimized program differs from the unoptimized version benefits 
the user in tracking down the error as well as the optimizer writer in debug- 
ging the optimizer. 

— A wide range of optimizations including classical optimizations, register allo- 
cation, loop transformations, and inlining can be handled by the technique. 
Many of these optimizations are not supported by current techniques to 
debug optimized code. 

— The optimized code is not modified except for breakpoints, and thus no 
recompilation is required. 

The design of comparison checking has several challenges. We must decide
what values computed in both the unoptimized and optimized programs should 
be compared and how to associate those values in both programs. These tasks 
are achieved by generating mappings between corresponding instances of state- 
ments in the unoptimized and optimized programs as optimizations are applied. 
We must also decide how to execute both programs and when the comparisons 
should be performed. Since values can be computed out of order, a mechanism 
must save values that are computed early. These values are saved in a value
pool and removed when no longer needed. Finally, the above tasks must be au- 
tomated. The mappings are used to automatically generate annotations for the 
optimized and unoptimized programs. These annotations guide the comparison 
checker in comparing corresponding values and addresses. Checking is performed 
for corresponding assignments of values to source level variables and results of 
corresponding branch predicates. In addition, for assignments through arrays 
and pointers, checking is done to ensure the addresses to which the values are
assigned correspond to each other. All assignments to source level variables are 
compared with the exception of those dead values that are never computed in the 
optimized code. We implemented the scheme, which executes the unoptimized 
and optimized versions of C programs. We also present our experience with the 
system and experimental results demonstrating the practicality of the system. 

While our comparison checking scheme is generally applicable to a wide range 
of optimizations from simple code reordering transformations to loop transfor- 
mations, this paper focuses on classical statement level optimizations. Optimiza- 
tions can be performed at the source, intermediate, or target code level, and our 
checker is language independent. Mappings between the source and intermedi- 
ate/target code are maintained to report differences in terms of the source level 
statements. 

This paper is organized as follows. Section 2 demonstrates the usefulness of the
comparison checking scheme. An overview of the mappings is presented in Section 3.
Sections 4 and 5 describe the annotations and the comparison checker.
Section 6 provides an example of comparison checking. Section 7 presents experimental
results. Related work is discussed in Section 8, and concluding remarks
are given in Section 9.



2 Comparison Checking Scenarios 

In this section, we demonstrate the usefulness of comparison checking using ex- 
amples. Consider the unoptimized C program fragment and its optimized version 
in Figure 2(a). Assume the unoptimized program has been tested, the optimizer 
is turned on, and when the optimized program executes, it returns incorrect out- 
put. The natural inclination of the user is to think that an error in the optimizer 
caused the optimized program to generate incorrect output. However, using the 
comparison checker, a difference in the program versions is detected at line 7 in 
the unoptimized program. The checker indicates that the value 2 is assigned to 
y in the unoptimized program and the value 22 is assigned to y in the optimized 
program at line 7 during loop iteration 10. The checker also indicates that no
optimizations were applied to the statement at line 7. To determine if there is
an error in the unoptimized program or if perhaps a prior optimization caused
the problem, the user utilizes a conventional debugger and places a breakpoint
in the unoptimized program at line 7. Upon the last iteration of the loop, the
user examines the value of i, which is 10, and realizes that 10 is not within the
bounds of array a (C assigns subscripts 0 ... 9). At this point, the user realizes
that the error is in the original program. The user can fix the original program,
generate the new optimized version, and repeat the checking. Notice that the
user does not need to examine the optimized program.

Fig. 2. Program examples for comparison checking: (a) Example 1, (b) Example 2

What caused the optimized program to generate incorrect output? The cause 
was an optimization that changed the memory layout of the program. Consider 
the execution time behavior of both program versions. The loop stores values 
into array a at statement 5 and overruns the upper bound of a when i is 10. In
the unoptimized program, the assignment to a[i] when i is 10 actually stores a
value in b. Assuming b is not used in the unoptimized program, the output of the
program is correct. However, in the optimized program in which storage for b is
removed because variable b is dead, the assignment to a[i] when i is 10 actually
stores a value in x. Since the same value of x is overwritten at line 6 but needed 
in subsequent statements, the output of the optimized code is incorrect. 
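
The effect of the changed memory layout can be simulated with a flat memory model. The sketch below is only an illustration under stated assumptions: it assumes the variables are stored consecutively in declaration order (C does not guarantee any such layout), and the stored values 5 and 22 are made up for the example.

```python
def store_past_end(layout):
    """layout: ordered list of (name, size) pairs placed consecutively in memory."""
    base, offset = {}, 0
    for name, size in layout:
        base[name] = offset
        offset += size
    memory = [0] * offset
    memory[base["x"]] = 5          # x holds a value needed later (assumed value)
    memory[base["a"] + 10] = 22    # a[10]: one past the last valid subscript 9
    return memory[base["x"]]

unoptimized = [("a", 10), ("b", 1), ("x", 1)]   # dead variable b still has storage
optimized = [("a", 10), ("x", 1)]               # storage for b removed

print(store_past_end(unoptimized))   # 5  (the stray store lands in b)
print(store_past_end(optimized))     # 22 (the stray store overwrites x)
```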

Even when the outputs generated by the unoptimized and optimized pro- 
grams are both the same and correct, comparing the internal behaviors of the 
unoptimized and optimized programs can still help users detect errors in the orig- 
inal program. This would happen when an error in the unoptimized program is 
unmasked in the optimized program and affects the output only on certain in- 
puts. Assume the unoptimized program and optimized program in Figure 2(a) 
output the same values at statement 9. The error in the program described above 
does not affect this output. However, using the comparison checker, a difference
in the programs is detected at line 7 in the unoptimized program. As mentioned
above, the user can utilize the information supplied by the checker and a conventional
debugger to determine that at line 7, array a is indexed out of its
bounds. 

Differences in the behavior of the unoptimized and optimized programs can 
also be caused by errors in the way optimizations are applied. Consider the ex- 
ample in Figure 2(b). Using the checker and assuming x, y, and z upon entry 
to the procedure one are 1, 2, and 3, respectively, a difference is detected in the
internal behavior of both programs at line 8 in the unoptimized program. The 
checker indicates that the value 3 is assigned in the unoptimized program and 
the value 5 is assigned in the optimized program. The checker indicates that con- 
stant propagation was applied to this statement in the optimized program and 
an operand in line 8 was replaced by the constant 5. The user can once again use 
a conventional debugger to determine if there is an error in the original program. 
The user places a breakpoint in the unoptimized program at line 8. When the 
unoptimized program execution reaches the breakpoint, the user examines the 
values of the operands at line 8 and notices that none of the operands' values is 5.
The user concludes the value 5 should not have replaced an operand at line 8 
and therefore, the constant propagation optimization has been incorrectly ap- 
plied. The user disables the constant propagation optimization, the checking is 
repeated, and no differences are reported. The difference was eliminated without 
sacrificing the benefits of other correctly applied optimizations. 



3 Mappings 

To compare values computed in both the unoptimized and optimized programs, 
we need to determine the corresponding statement instances that compute the 
values. A statement instance in the unoptimized program and a statement in- 
stance in the optimized program are said to correspond if the values computed 
by the two instances should be the same and the latter was derived from the 
former by the application of some optimizations. Our mappings associate state- 
ment instances in the unoptimized program and the corresponding statement 
instances in the optimized program. In [16], we developed a mapping technique
to identify corresponding statement instances and described how to generate
mappings that support code optimized with classical optimizations as well as 
loop transformations. Since optimizations can be applied in any order and as 
many times as desired, the mappings also summarize the effects of all previously 
applied optimizations. In this section, we use an example to describe the map- 
pings used by the comparison checking scheme to support code optimized with 
classical optimizations. 

Fig. 3. Mappings for Unoptimized and Optimized Code

In Figure 3, the unoptimized program and the optimized program version
as well as the mappings are shown. For ease of understanding, the source level
statements shown are simple. The mapping technique applies when optimizations
are applied at the source, intermediate, or target code level. The mappings are
illustrated by labeled dotted edges between corresponding statements in both
programs. The following optimizations were applied.

— constant propagation - the constant 1 in S1 is propagated, as shown in S2',
S3', and S11'.

— copy propagation - the copy M in S6 is propagated, as shown by S7' and
S10'.

— dead code elimination - S1 and S6 are dead after constant and copy propagation.

— loop invariant code motion - S5 is moved out of the doubly nested loop. S10
is moved out of the inner loop.

— partial redundancy elimination - S9 is partially redundant with S8.

— partial dead code elimination - S10 is moved out of the outer loop.

Mapping labels identify the instances in the unoptimized program and the
corresponding instances in the optimized program. If there is a one-to-one
correspondence between the instances, then the number of instances is the same and
corresponding instances appear in the same order. If the number of instances is
not the same, a consecutive subsequence of instances in one sequence corresponds
to a single instance in the other.

Comparison Checking: An Approach to Avoid Debugging of Optimized Code 275

Since code optimizations move, modify, delete, and add statements in a program,
the number of instances of a statement can increase, decrease, or remain
the same in the optimized program as compared to the unoptimized program. If
a statement is moved out of a loop, then the statement will execute fewer times in
the optimized code. For example, in loop invariant code motion (see statements
S5 and S5'), the statement moved out of the loop will execute fewer times and
thus all the instances of statement S5 in the loop in the unoptimized code must
map to one instance of statement S5' in the optimized code. If a statement is
moved below a loop, for example by applying partial dead code elimination (see
statements S10 and S10'), then only the last instance of statement S10 in the
loop in the unoptimized code is mapped to one instance of statement S10' in the
optimized code. If the optimization does not move the statement across a loop
boundary (see statements S2 and S2'), then the number of instances executed in
the unoptimized and optimized code is the same and thus one instance of statement
S2 in the unoptimized code is mapped to one instance of statement S2'
in the optimized code. The removal of statements (see statements S1 and S6)
can cause mappings to be removed because there is no correspondence between
a statement in the unoptimized code and a deleted statement in the optimized
code.
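
The three instance correspondences described above can be written down as pairs of instance numbers. The sketch below is an illustration only; the kind names and the function are ours and are not the mapping representation of [16]. It assumes the statement executes at least once in the unoptimized code.

```python
def map_instances(kind, n_unopt):
    """Return (unoptimized instance, optimized instance) pairs, numbered from 1."""
    if kind == "one_to_one":      # statement not moved across a loop boundary
        return [(i, i) for i in range(1, n_unopt + 1)]
    if kind == "all_to_one":      # e.g. loop invariant code motion
        return [(i, 1) for i in range(1, n_unopt + 1)]
    if kind == "last_to_one":     # e.g. partial dead code elimination
        return [(n_unopt, 1)]
    raise ValueError(f"unknown mapping kind: {kind}")

# A statement executed three times in the unoptimized loop:
print(map_instances("one_to_one", 3))    # [(1, 1), (2, 2), (3, 3)]
print(map_instances("all_to_one", 3))    # [(1, 1), (2, 1), (3, 1)]
print(map_instances("last_to_one", 3))   # [(3, 1)]
```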



4 Annotations 

Code annotations guide the comparison checking of values computed by corre- 
sponding statement instances from the unoptimized and optimized code. An- 
notations (1) identify program points where comparison checks should be per- 
formed, (2) indicate if values should be saved in a value pool so that they will be 
available when checks are performed, and (3) indicate when a value currently re- 
siding in the value pool can be discarded since all checks involving the value have 
been performed. The selection and placement of annotations are independent of 
particular optimizations and depend only on which and how statement instances 
correspond and the relative positions of corresponding statements in both the 
unoptimized and optimized programs. Data flow analysis, including reachabil- 
ity and postdominance, is used to determine where and what annotations to 
use [15]. Annotations are placed after all optimizations are performed and tar- 
get code has been generated, and therefore, the code to emit the annotations 
can be integrated as a separate phase within a compiler. 

Five different types of annotations are needed to implement our comparison
checking strategy. In Figure 4(a), annotations, shown in dotted boxes, are
given for the example in Figure 3. In the following description, Suopt indicates a
statement in the unoptimized code and Sopt a statement in the optimized code.

Check Suopt annotation: This annotation is associated with a program point
in the optimized code to indicate that a check of a value computed by statement
Suopt is to be performed. The corresponding value to be compared is the result
of the most recently executed statement in the optimized code. For example, in
Figure 4(a), the annotation Check S2 is associated with S2'.

Fig. 4. (a) Annotated Unoptimized and Optimized Code (b) Check Traces

A variation of this annotation is the Check Suopt with Si, Sj, . . . annotation, 
which is associated with a program point in the optimized code to indicate a 
check of a value computed by statement Suopt is to be performed with a value 
computed by one of Suopt ’s corresponding statements Si,Sj, . . . in the optimized 
program. The corresponding value to be compared is either the result of the
most recently executed statement in the optimized code or is in the value pool. 

Save Sopt annotation: If a value computed by a statement Sopt cannot be 
immediately compared with the corresponding value computed by the unopti- 
mized code, then the value computed by Sopt must be saved in the value pool. 
In some situations, a value computed by Sopt must be compared with multiple 
values computed by the unoptimized code. Therefore, it must be saved until all 
those values have been computed and compared. The annotation Save Sopt is 
associated with Sopt to ensure the value is saved. In Figure 4(a), the statement
S5 in the unoptimized code, which is moved out of the loops by invariant code
motion, corresponds to statement S5' in the optimized code. The value computed
by S5' cannot be immediately compared with the corresponding values
computed by S5 in the unoptimized code because S5' is executed prior to the
execution of S5. Thus, the annotation Save S5' is associated with S5'.

Delay Suopt and Checkable Suopt annotations: If the value computed by the
execution of a statement Suopt cannot be immediately compared with the corresponding
value computed by the optimized code because the correspondence
between the values cannot be immediately established, then the value of Suopt
must be saved in the value pool. The annotation Delay Suopt is associated with
Suopt to indicate the checking of the value computed by Suopt should be delayed,
saving the value in the value pool. The point in the unoptimized code at which
checking can finally be performed is marked using the annotation Checkable
Suopt.

In some situations, a delay check is needed because the correspondence be- 
tween statement instances cannot be established unless the execution of the 
unoptimized code is further advanced. In Figure 4(a), statement S10 inside the
loops in the unoptimized code is moved after the loops in the optimized code
by partial dead code elimination. In this situation, only the value computed by
statement S10 during the last iteration of the nested loops is to be compared
with the value computed by S10'. However, an execution of S10 corresponding
to the last iteration of the nested loops can only be determined when the execution
of the unoptimized code exits the loops. Therefore, the checking of S10's
value is delayed.

Delete S: This annotation is associated with a program point in the unopti- 
mized/optimized code to indicate a value computed previously by S and stored 
in the value pool can be discarded. Since a value may be involved in multiple 
checks, a delete annotation must be introduced at a point where all relevant 
checks would have been performed. In Figure 4(a), the annotation Delete S5 
is introduced after the loops in the optimized code because at that point, all 
values computed by statement S5 in the unoptimized code will certainly have 
been compared with the corresponding value computed by S5' in the optimized
code. 

Check-self S: This annotation is associated with a program point in the
unoptimized/optimized code and indicates that values computed by S must be
compared against each other to ensure the values are the same. This annotation
causes a value of S to be saved in the value pool.
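
The interplay of the save, check, and delete annotations with the value pool can be sketched as a replay over a list of events. This is a simplified illustration and not the checker described in this paper: the tuple encoding of events is hypothetical, the two program executions are flattened into one pre-recorded sequence, and the delay, checkable, and check-self annotations are omitted.

```python
def run_checks(events):
    """Replay annotation events; return (mismatches, remaining value pool)."""
    pool, mismatches = {}, []
    for event in events:
        kind = event[0]
        if kind == "save":        # value computed early in the optimized code
            _, stmt, value = event
            pool[stmt] = value
        elif kind == "check":     # compare an unoptimized value with a pooled one
            _, uopt_stmt, value, opt_stmt = event
            if pool[opt_stmt] != value:
                mismatches.append((uopt_stmt, value, pool[opt_stmt]))
        elif kind == "delete":    # all checks involving the value are done
            _, stmt = event
            del pool[stmt]
    return mismatches, pool

# S5' is executed once before the loops; S5 is executed on every iteration.
events = [
    ("save", "S5'", 42),
    ("check", "S5", 42, "S5'"),
    ("check", "S5", 42, "S5'"),
    ("delete", "S5'"),
]
mismatches, pool = run_checks(events)
print(mismatches, pool)   # [] {}

# A differing value is reported as (statement, unoptimized value, optimized value).
bad, _ = run_checks([("save", "S5'", 42), ("check", "S5", 41, "S5'")])
print(bad)                # [('S5', 41, 42)]
```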

5 Comparison Checker 

The comparison checker compares values computed by both the unoptimized 
and optimized program executions to ensure the semantic behaviors of both pro- 
grams are the same. The unoptimized and optimized programs are annotated 
with actions that guide the comparison checking process. Once a program point 
is reached, the actions associated with the annotation are executed by the com- 
parison checker. To avoid modifying the unoptimized and optimized programs, 
breakpoints are used to extract values from the unoptimized and optimized pro- 
grams as well as activate annotations. 

A high level conceptual overview of the comparison checker algorithm is
given in Figure 5. Execution begins in the unoptimized code and proceeds until
a breakpoint is reached. Using the annotations, the checker can determine
if the value computed can be checked at this point. If so, the optimized
program executes until the corresponding value is computed (as indicated by an
annotation), at which time the check is performed on the two values. During
the execution of the optimized program, any values that are computed “early”
(i.e., the corresponding value in the unoptimized code has not been computed
yet) are saved in the value pool, as directed by the annotations. If annotations
indicate the checking of the value computed by the unoptimized program cannot
be performed at the current point, the value is saved for future checking.
The checker continues to alternate between executions of the unoptimized and
optimized programs. Annotations also indicate when values that were saved for
future checking can finally be checked and when the values can be removed from
the value pool. Any statement instances eliminated in the optimized code are
not checked.

    Process annotations at breakpoints in the Unoptimized Program:

        If delay comparison check annotation then
            save value computed
        If no delay annotation then
            switch execution to the optimized program
            to perform the check on the value
        If delete value annotation then
            discard saved value
        If comparison check annotation on a delayed check then
            perform the comparison check
            and if error then report error

    Process annotations at breakpoints in the Optimized Program:

        If save annotation then
            save value computed
        If delete annotation then
            discard saved value
        If comparison check annotation then
            perform the comparison check
            and if error then report error
            switch execution to the unoptimized program

Fig. 5. Comparison Checker Algorithm
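
The alternation between the two executions can be sketched in Python as follows. This is a minimal model under stated assumptions, not the COP implementation: each program run is replaced by a pre-recorded list of breakpoint events of the form (action, statement, value), and the action names ("check", "delay", "checkable", "save", "check_saved", "delete") are invented here to stand for the annotations.

```python
def compare_runs(unopt_events, opt_events):
    """Return the statements whose unoptimized and optimized values differ."""
    pool = {}                  # value pool: (program, stmt) -> saved value
    errors = []
    opt = iter(opt_events)     # the optimized run resumes where it stopped

    def check_in_optimized(stmt, expected):
        # Run the optimized program until its check of `stmt` is processed.
        for action, s, value in opt:
            if action == "save":                  # computed "early"
                pool[("opt", s)] = value
            elif action == "delete":
                pool.pop(("opt", s), None)
            elif action in ("check", "check_saved") and s == stmt:
                if action == "check_saved":       # use the value saved earlier
                    value = pool[("opt", s)]
                if value != expected:
                    errors.append(stmt)
                return                            # switch back to unoptimized

    for action, stmt, value in unopt_events:
        if action == "delay":                     # cannot be checked yet
            pool[("unopt", stmt)] = value
        elif action == "check":                   # checkable right away
            check_in_optimized(stmt, value)
        elif action == "checkable":               # delayed check now possible
            check_in_optimized(stmt, pool[("unopt", stmt)])
        elif action == "delete":
            pool.pop(("unopt", stmt), None)
        # any other action: statement was eliminated, so it is not checked
    return errors


# Made-up values for a trace shaped like the example of Figure 4.
unopt = [("none", "S1", 7), ("check", "S2", 1), ("check", "S3", 2),
         ("check", "S4", 3), ("check", "S5", 4)]
opt = [("check", "S2", 1), ("save", "S5", 4), ("check", "S3", 2),
       ("check", "S4", 3), ("check_saved", "S5", None)]
errors = compare_runs(unopt, opt)

# Same traces, but the optimized program computes a wrong value for S5.
bad_opt = [("check", "S2", 1), ("save", "S5", 99), ("check", "S3", 2),
           ("check", "S4", 3), ("check_saved", "S5", None)]
bad = compare_runs(unopt, bad_opt)
```

Using one shared iterator for the optimized events is what models "switching": each call resumes the optimized run exactly where the previous check left it.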

6 Comparison Checking Scheme Example 

Consider the unoptimized and optimized program segments in Figure 4. Assume 
all the statements shown are source level statements and loops execute for a 
single iteration. Breakpoints are indicated by circles. The switching between the 
unoptimized and optimized program executions by the checker is illustrated by 
the traces. The traces include the statements executed as well as the breakpoints 
(circled) where annotations are processed. The arrows indicate the switching 
between programs. 

The unoptimized program starts to execute with S1 and continues executing
without checking, as S1 was eliminated from the optimized program. After
S2 executes, breakpoint 1 is reached and the checker determines from the
annotation that the value computed can be checked at this point, so the
optimized program executes until Check S2 is processed, which occurs at
breakpoint 2. The values computed by S2 and S'2 are compared. The unoptimized
program resumes execution and the loop iteration at S3 begins. After S3
executes, breakpoint 3 is reached and the optimized program executes until Check
S3 is processed. Since a number of comparisons have to be performed using
the value computed by S'5, when breakpoint 4 is reached, the annotation Save
S'5 is processed and, consequently, the value computed by S'5 is stored in the
value pool. The optimized code continues executing until breakpoint 5, at which
time the annotation Check S3 is processed. The values computed by S3 and S'3
are compared. S4 then executes and its value is checked. S5 then executes and
breakpoint 8 is encountered. The optimized program executes until the value
computed by S5 can be compared, indicated by the annotation Check S5 with
S'5 at breakpoint 9. The value of S'5 saved in the value pool is used for the
check. The programs continue executing in a similar manner.

7 Implementation of the Comparison Checking Scheme 

We implemented our Comparison checking of Optimized code scheme, called
COP, to test our algorithms for instruction mapping, annotation placement, and
checking. Lcc [9] was used as the compiler for the application program and was
extended to include a set of optimizations, namely loop invariant code motion,
dead code elimination, partial redundancy elimination, copy propagation, and
constant propagation and folding. On average, the optimized code executes 16%
faster than the unoptimized code.

As a program is optimized, mappings are generated. Besides generating target
code, lcc was extended to generate a file containing breakpoint information
and annotations that are derived from the mappings; the code to emit breakpoint
information and annotations is integrated within lcc through library routines.
Thus, compilation and optimization of the application program produce the 
target code for both the unoptimized program and optimized program as well 
as auxiliary files containing breakpoint information and annotations for both 
the unoptimized and optimized programs. These auxiliary files are used by the 
checker. Breakpoints are generated whenever the value of a source level assign- 
ment or a predicate is computed and whenever array and pointer addresses are 
computed. Breakpoints are also generated to save base addresses for dynami- 
cally allocated storage of structures (e.g., malloc(), free(), etc.). Array addresses 
and pointer addresses are compared by actually comparing their offsets from 
the closest base addresses collected by the checker. Floating point numbers are
compared by allowing for inexact equality; that is, two floating point numbers 
are allowed to differ by a certain small delta [21]. Breakpointing is implemented 
using fast breakpoints [17]. 
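
The two comparison rules above can be sketched in Python as follows. This is a hedged illustration, not COP's code: the function names and the default delta are hypothetical, and "closest base address" is assumed here to mean the largest recorded base address that does not exceed the address being compared.

```python
import bisect


def floats_match(a, b, delta=1e-6):
    # Inexact equality: the two values may differ by at most `delta`.
    return abs(a - b) <= delta


def offset_from_base(address, bases):
    # `bases` is a sorted list of base addresses collected so far.
    i = bisect.bisect_right(bases, address) - 1
    if i < 0:
        raise ValueError("address precedes every recorded base address")
    return address - bases[i]


def addresses_match(addr_unopt, bases_unopt, addr_opt, bases_opt):
    # Raw addresses differ between the two programs, so compare offsets.
    return (offset_from_base(addr_unopt, bases_unopt)
            == offset_from_base(addr_opt, bases_opt))
```

Comparing offsets rather than raw addresses matters because the unoptimized and optimized programs generally place the same object at different absolute addresses.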

Experiments were performed to assess the practicality of COP. Our main
concerns were usefulness as well as cost of the comparison checking scheme. COP
was found to be very useful in actually debugging our optimizer. Errors were
easily detected and located in the implementation of the optimizations as well
as in the mappings and annotations. When an unsuccessful comparison between
two values was detected, COP indicated which source level statement computed
the value, the optimizations applied to the statement, and which statements in
the unoptimized and optimized assembly code computed the values.

           Source    Unoptimized Code        Optimized Code          COP
Program    length    (CPU)      annotated    (CPU)      annotated    (CPU)      (response
           (lines)              (CPU)                   (CPU)                   time)

wc            338    00:00.26   00:02.16     00:00.18   00:01.86     00:30.29   00:53.33
yacc           59    00:01.10   00:06.38     00:00.98   00:05.84     01:06.95   01:34.33
go*         28547    00:01.43   00:08.36     00:01.38   00:08.53     01:41.34   02:18.82
m88ksim*    17939    00:29.62   03:08.15     00:24.92   03:07.39     41:15.92   48:59.29
compress*    1438    00:00.20   00:02.91     00:00.17   00:02.89     00:52.09   01:22.82
li*          6916    01:00.25   05:42.39     00:55.15   05:32.32     99:51.17   123:37.67
ijpeg*      27848    00:22.53   02:35.22     00:20.72   02:33.98     38:32.45   57:30.74

* Spec95 benchmark test input set was used.

Table 1. Execution Times (minutes:seconds)

In terms of cost, we were interested in the slowdowns of the unoptimized and
optimized programs and the speed of the comparison checker. COP performs
on-the-fly checking during the execution of both programs. Both value and address
comparisons are performed. In our experiments, we ran COP on an HP 712/100
and the unoptimized and optimized programs on separate SPARC 5 workstations
instead of running all three on the same processor as described in Section 3.
Messages are passed through sockets on a 10 Mb network. A buffer is used to 
reduce the number of messages sent between the executing programs and the 
checker. We ran some of the integer Spec95 benchmarks as well as some smaller 
test programs. 

Table 1 shows the CPU execution times of the unoptimized and optimized 
programs with and without annotations. On average, the annotations slowed 
down the execution of the unoptimized programs by a factor of 8 and that of 
the optimized programs by a factor of 9. The optimized program experiences 
greater overhead than the unoptimized program because more annotations are 
added to the optimized program. 
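
The slowdown factors can be recomputed from the program timings in Table 1, as in the following Python sketch. The timing strings are copied from the table; taking the arithmetic mean of the per-program ratios is an assumption about how the average was formed.

```python
def seconds(t):
    # Convert "minutes:seconds" (e.g. "03:08.15") to seconds.
    minutes, secs = t.split(":")
    return int(minutes) * 60 + float(secs)


# program: (unopt CPU, unopt annotated CPU, opt CPU, opt annotated CPU)
TABLE1 = {
    "wc":       ("00:00.26", "00:02.16", "00:00.18", "00:01.86"),
    "yacc":     ("00:01.10", "00:06.38", "00:00.98", "00:05.84"),
    "go":       ("00:01.43", "00:08.36", "00:01.38", "00:08.53"),
    "m88ksim":  ("00:29.62", "03:08.15", "00:24.92", "03:07.39"),
    "compress": ("00:00.20", "00:02.91", "00:00.17", "00:02.89"),
    "li":       ("01:00.25", "05:42.39", "00:55.15", "05:32.32"),
    "ijpeg":    ("00:22.53", "02:35.22", "00:20.72", "02:33.98"),
}


def mean_slowdown(plain_col, annotated_col):
    # Mean over programs of (annotated time / plain time).
    ratios = [seconds(row[annotated_col]) / seconds(row[plain_col])
              for row in TABLE1.values()]
    return sum(ratios) / len(ratios)


unopt_factor = mean_slowdown(0, 1)   # unoptimized programs
opt_factor = mean_slowdown(2, 3)     # optimized programs
```

Under this assumption the two means come out at roughly 7.6 and 8.6, which is consistent with the rounded factors of 8 and 9 quoted above.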

Table 1 also shows the CPU and response times of COP. The performance
of COP depends greatly upon the lengths of the execution runs of the programs.
Comparison checking took from a few minutes to a few hours in terms of CPU
and response times. These times are clearly acceptable if comparison checking is
performed off-line. We found that the performance of the checker is bounded by
the processing platform and speed of the network. A faster processor and 100
Mb network would considerably lower these times. In fact, we ran COP on a 333
MHz Pentium Pro processor and found the performance to be on average 6 times
faster in terms of CPU time. We did not have access to a faster network. We
measured the pool size during our experiments and found it to be fairly small.
If addresses are not compared, the pool contains fewer than 40 values for all
programs. If addresses are compared, then the pool contains fewer than 1900
values.



8 Related Work 

The problem of debugging optimized code has long been recognized [13,20], with 
most of the previous work focusing on the development of source level debug- 
gers of optimized code [13,10,25,19,7,12,14,4,18,5,6,23,3] that use static analysis 
techniques to determine whether expected values of source level variables are 
reportable at breakpoints. Mappings that track an unoptimized program to its 
optimized program version are statically analyzed to determine the proper place- 
ment of breakpoints as well as to determine if values of source level variables 
are reportable at given breakpoints. However, not all values of source level
variables are reportable and optimizations are restricted. One reason is that
the mappings are too coarse in that statements (and not instances) from the 
unoptimized program are mapped to corresponding statements in the optimized 
program. Another reason is that the effectiveness of static analysis in reporting 
expected values is limited. Recent work on source level debuggers of optimized 
code utilizes some dynamic information to provide more of the expected behav- 
ior of the unoptimized program. By executing the optimized code in an order 
that mimics the execution of the unoptimized program, some values of variables 
that are otherwise not reportable by other debuggers can be reported in [24]. 
However, altering the execution of the optimized program masks certain user 
and optimizer errors. The technique proposed in [8] can also report some val- 
ues of variables that are not reportable by other debuggers by timestamping 
basic blocks to obtain a partial history of the execution path, which is used to 
precisely determine what variables are reportable at breakpoints. In all of the 
prior approaches, limitations have been placed on the debugging system by ei- 
ther restricting the type or placement of optimizations, modifying the optimized 
program, or inhibiting debugging capabilities. These techniques have not found 
their way into production type environments, and debugging of optimized code 
still remains a problem. 

In [16], we developed a mapping technique for associating corresponding
statement instances after optimization and described how to generate mappings
that support code optimized with classical optimizations as well as loop
transformations. Since we track corresponding statement instances and use dynamic
information, our checker can automatically detect and report the earliest source
statement that computes a different value in the optimized code as well as report
the optimizations applied to that statement.

For understanding optimized code, [22] creates a modified version of the 
source program along with annotations to display the effects of optimizations. 
Their annotations are higher level than ours as their focus is on program under- 
standing rather than comparison checking. 

The goal of our system is not to build a debugger for optimized code but 
to verify that given an input, the semantic behaviors of both the unoptimized 
and optimized programs are the same. The most closely related work to our
approach is Guard [21,2,1], which is a relative debugger, but not designed to 
debug optimized programs. Using Guard, users can compare the execution of 
one program, the reference program, with the execution of another program, the 
development version. Guard requires the user to formulate assertions about the 
key data structures in both versions which specify the locations at which the 
data structures should be identical. The relative debugger is then responsible 
for managing the execution of the two programs and reporting any differences in 
values. The technique does not require any modifications to user programs and 
can perform comparisons on-the-fly. The important difference between Guard 
and COP is that in Guard, the user essentially has to manually insert all of 
the mappings and annotations, while this is done automatically in COP. Thus 
using COP, the optimized program is transparent to the user. We also are able 
to check the entire program which would be difficult in Guard since that would 
require the user inserting all mappings. In COP, we can easily restrict checking 
to certain regions or statements as Guard does. We can also report the particular
optimizations involved in producing erroneous behavior. 

The Bisection debugging model [11] was also designed to identify semantic 
differences between two versions of the same program, one of which is assumed
to be correct. The bisection debugger attempts to identify the earliest point 
where the two versions diverge. However, to handle the debugging of optimized 
code, the expected values of source level variables must be reportable at all 
breakpoints. This is not an issue in COP because expected values of source level 
variables are available at comparison points, having been saved before they are 
overwritten. 



9 Conclusion 

Optimizations play an important role in the production of quality software. Each 
level of optimization can improve program performance by approximately 25%. 
Moreover, optimizations are often required because of the time and memory con- 
straints imposed on some systems. Thus, optimized code plays, and will continue 
to play, an important role in the development of high performance systems. 

Our work adds a valuable tool in the toolkit of software developers. This tool 
verifies that given an input, the semantic behaviors of both the unoptimized and 
optimized program versions are the same. When the behaviors differ, information 
to help the programmer locate the cause of the differences is provided. Even when 
the outputs generated by the unoptimized and optimized programs are correct, 
comparing the internal behaviors of the unoptimized and optimized programs 
can help programmers detect errors in the original program. We implemented 
this tool, which executes the unoptimized and optimized versions of C programs,
and ran experiments that demonstrate the approach is effective and practical. 

Our checking strategy can also be used to check part of the execution of an 
application program. For example, during testing, programs are typically exe- 
cuted under different inputs, and checking the entire program under every input 
may be redundant and unnecessary. Our tool can check regions of the application
program, as specified by the user. In this way, the technique can scale to larger
applications. Furthermore, the mapping and annotation techniques can be used
in different software engineering applications. For example, optimized code is 
difficult to inspect in isolation of the original program. The mappings enable the 
inspection of optimized code guided by the inspection of the unoptimized code. 
Also, the same types of annotations can be used for implementing a source level 
debugger that executes optimized code and saves runtime information so that 
all values of source level variables are reportable. We are currently developing 
such a debugger. 
Abstract. In this paper, we describe a testing technique, called structural
specification-based testing (SST), which utilizes the formal specification of 
a program unit as the basis for test selection and test coverage measurement. 
We also describe an automated testing tool, called ADLscope, which sup- 
ports SST for program units specified in Sun Microsystems’ Assertion Defi- 
nition Language (ADL). ADLscope automatically generates coverage condi- 
tions from a program's ADL specification. While the program is tested, 
ADLscope determines which of these conditions are covered by the tests. An 
uncovered condition exhibits aspects of the specification inadequately exer- 
cised during testing. The tester uses this information to develop new test data 
to exercise the uncovered conditions. 

We provide an overview of SST’s specification-based test criteria and de- 
scribe the design and implementation of ADLscope. Specification-based 
testing is guided by a specification, whereby the testing activity is directly re- 
lated to what a component under test is supposed to do, rather than what it 
actually does. Specification-based testing is a significant advance in testing, 
because it is often more straightforward to accomplish and it can reveal fail- 
ures that are often missed by traditional code-based testing techniques. As an 
initial evaluation of the capabilities of specification-based testing, we con- 
ducted an experiment to measure defect detection capabilities, code coverage 
and usability of SST/ADLscope; we report here on the results. 
1 Introduction

Specification-based testing is guided by a specification, and thus the testing activity is 
directly related to what a component under test is supposed to do, rather than what it 
actually does. Specification-based testing is often more straightforward to accomplish 
because it is based on a black box description of the component being tested, whereas 
code-based techniques require detailed knowledge of the implementation. In addition, 
it can reveal failures that are often missed by traditional code-based techniques. 

In a previous publication [6], we presented preliminary ideas on a testing technique 
called structural specification-based testing (SST), a testing technique based upon ax- 
iomatic-style specifications. We also described a tool that supports SST for programs 
specified with the Assertion Definition Language (ADL)¹. Since then, we have made
improvements and revisions to the coverage criteria defined by SST. In particular, we 
now use simplified criteria that do not combine coverage conditions. This modification 
has made SST much more tractable. We have also re-designed and re-implemented the 
necessary tool support, collectively called ADLscope, for the new approach. Further- 
more, we have conducted an experiment showing that the new coverage criteria provide 
more than sufficient defect detection and code coverage and that the technique is prac- 
tical and usable. 

We summarize SST/ADLscope’s contributions as follows: 

• SST/ADLscope performs formal specification-based testing. SST defines what 
should be tested (test criteria) based on the program’s formal specification. 
ADLscope guides the testing activity based on ADL specifications of the program 
to be tested. 

• SST/ADLscope measures test coverage. SST defines specification-based test cov- 
erage criteria. ADLscope measures coverage with respect to a program’s ADL 
specification. Measuring test coverage provides testers an estimate of test adequa- 
cy. The advantage of ADLscope over other test coverage tools is that ADLscope 
measures specification coverage as opposed to code coverage. 

• SST/ADLscope facilitates test selection. Any uncovered coverage condition sug- 
gests that certain aspects of the specification have not been tested adequately. 
This forces the tester to select additional test data to cover those aspects. 

• SST/ADLscope facilitates unit testing. Although system testing is important, 
alone it is insufficient. Moreover, in practice, it often involves just “poking- 
around.” Unit testing is more rigorous and is also essential to assuring quality of 
reusable software packages. 

• SST/ADLscope facilitates API testing. SST/ADLscope is most useful for con- 
formance testing of an API (application programming interface). Code-based 
coverage techniques are generally not applicable in conformance testing since 
higher code coverage does not imply that an implementation actually conforms to 
its specification. Moreover, many code-based coverage techniques require the
source code, which is often not available for conformance testing because of
proprietary issues. ADLscope, on the other hand, only requires the object code
or compiled libraries.

1. ADL is a formal specification language developed at Sun Microsystems Laboratories
to describe behavior of functions in the C programming language.

A large portion of this paper is devoted to describing and reporting results of an ex- 
periment we carried out to measure defect detection, code coverage, and usability of 
SST/ADLscope. We feel that validation is critical to proposing new technologies and 
that this justifies our emphasis in this paper. We plan to conduct further studies of spec- 
ification-based testing, both for other new methods as well as for SST/ADLscope. For 
the time being, we believe that our experimental results for SST/ADLscope bode well 
for specification-based testing, in general. 

Section 2 provides an overview of ADL and its companion tool, the ADL Translator 
(ADLT). Section 3 describes the design and implementation of ADLscope. Section 4 
presents the coverage criteria defined by SST for ADLscope. Section 5 describes our 
experiment. Section 6 discusses related work in this area of research. Section 7 summa- 
rizes our contributions and discusses future work. 

2 Overview of ADL and ADLT 

2.1 ADL 

ADL is an axiomatic specification language by which the intended behavior of a pro- 
gram is specified by assertions describing each of the program’s C functions individu- 
ally. The assertions are based on first order predicate logic and use the same type system 
as C. Sharing the same type system as C allows tools to easily map an entity in the im- 
plementation domain to the specification domain, and vice versa. 

An example ADL specification appears in the upper right window of the browser 
in Figure 2. We refer the reader to [28] for more information. 

2.2 ADLT 

ADLT, developed at Sun Microsystems Laboratories, is a tool that compiles an ADL 
specification and a high-level description of test data into a test program. The generated 
test program embeds an automated test oracle that reports to the tester whether an
implementation conforms to or violates its ADL specifications for a given test case.
We provide a brief description of ADLT here. Refer to [25] for more details.

ADLT is used in two stages: the compile stage and the run stage. In the compile
stage, ADLT takes a function’s ADL specification and its test data description (TDD)
as input. TDD is a high-level specification of the test data. It allows the tester to
specify the intention, the data types, and the enumeration of test data without
supplying the actual values for the test data. The tester can provide the actual values
for the test data in later stages of the testing process. ADLT generates assertion
checking functions as output. Assertion checking functions are C functions that
evaluate the ADL specification to determine whether a test has conformed to or
violated the specification; they serve the purpose of the test oracle. The assertion
checking functions, the functions under test, and all their auxiliary functions are
compiled and linked into a test program.
In the run stage, the user invokes the test program to test the intended software. The
test program utilizes the functionality provided by the ADLT runtime library to check 
assertions and report results to the tester. 
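
The role of an assertion checking function as a test oracle can be illustrated with the following Python sketch. It is an analogy only: ADLT generates C functions from ADL specifications, whereas here the specification is a hand-written postcondition, and all names (make_assertion_checker, int_sqrt, int_sqrt_spec) are hypothetical.

```python
def make_assertion_checker(function_under_test, postcondition):
    # Wrap the function so that every call is judged against the specification.
    def checked_call(*args):
        result = function_under_test(*args)
        conforms = postcondition(result, *args)
        return result, conforms
    return checked_call


def int_sqrt(n):
    # Function under test (hypothetical): integer square root.
    return int(n ** 0.5)


def int_sqrt_spec(result, n):
    # Postcondition: result is the largest integer whose square is <= n.
    return result * result <= n < (result + 1) * (result + 1)


checked_sqrt = make_assertion_checker(int_sqrt, int_sqrt_spec)
report = {n: checked_sqrt(n)[1] for n in (0, 1, 15, 16, 17)}

# A faulty implementation is flagged by the same oracle.
checked_faulty = make_assertion_checker(lambda n: n // 2, int_sqrt_spec)
faulty_conforms = checked_faulty(16)[1]
```

The tester supplies only inputs; the verdict for each test case comes from the specification, not from hand-computed expected outputs.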

3 ADLscope Design and Implementation 

We have developed structural specification-based testing for ADL. ADLscope is the 
collection of components that we have integrated with ADLT to measure coverage of 
structural specification(ADL)-based test criteria. This section describes the design and 
implementation of ADLscope. 

The ADLscope release (including ADLT) can be downloaded from 
www.ics.uci.edu/~softtest/adlscope. The release contains the source files as well as ex- 
ecutables for Solaris and Linux. The original ADLT system and documentation can be 
downloaded from [29] . 

3.1 Integration with ADLT 

ADLscope is tightly integrated with ADLT. Figure 1 shows the dataflow of ADLscope 
and ADLT. 

In the compile stage, ADLscope generates a set of coverage checking functions, 
which are C functions that evaluate coverage conditions during the run stage. The cov- 
erage checking functions are linked into the test program. ADLscope also generates 
static coverage data, which store information about each coverage condition such as to 
which function it belongs and from where it is derived in the source. It also contains the 
name of the ADL specification, the compilation time, and the path of the ADL specifi- 
cation file. This information is later used by the ADLscope coverage browser or report 
generator to display coverage results to the tester. 

In the run stage, the test program calls the coverage checking functions, which
determine whether coverage conditions have been satisfied. The ADLscope runtime
library stores the resulting dynamic coverage data: a count for each coverage
condition indicating the number of times the condition is covered, together with
the run time and the name of the test.
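
The split between static and dynamic coverage data can be sketched in Python as follows. The condition identifiers, record fields, and the example function are hypothetical and do not reflect ADLscope's actual file formats or generated code.

```python
from collections import Counter

# Static coverage data: one record per coverage condition (compile stage).
STATIC = {
    "c1": {"function": "withdraw", "source": "amount > balance is true"},
    "c2": {"function": "withdraw", "source": "amount > balance is false"},
}

# Coverage checking functions: evaluate each condition for a given call.
CHECKS = {
    "c1": lambda amount, balance: amount > balance,
    "c2": lambda amount, balance: not amount > balance,
}

# Dynamic coverage data: condition id -> number of times covered (run stage).
dynamic = Counter()


def record_coverage(amount, balance):
    for cond_id, holds in CHECKS.items():
        if holds(amount, balance):
            dynamic[cond_id] += 1


for amount, balance in [(10, 100), (20, 100)]:   # two test cases
    record_coverage(amount, balance)

# Conditions never covered point the tester at missing test data.
uncovered = [c for c in STATIC if dynamic[c] == 0]
```

Here both test cases withdraw less than the balance, so the condition for an overdraft attempt remains uncovered and would be reported to the tester.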

ADLscope requires no additional work from the user. The user need only supply
one additional flag, -cov, on the ADLT command line to enable coverage measurement.
The bulk of ADLscope (with the exception of the database API and the coverage
browser, which are described in the following subsections) is developed in C and C++.

3.2 Database API 

We have developed a Java API for reading both static coverage data and dynamic cov- 
erage data from the ADLscope database. This API is used by the ADLscope coverage 
browser to display coverage information through a graphical user interface. It can also 
be used by users to generate customized coverage reports. 
Figure 1: Dataflow of ADLscope and ADLT integration.



3.3 Coverage Browser 

The ADLscope coverage browser is a hyper-linked browser. By clicking on a function 
or a coverage condition, the browser displays coverage information associated with that 
function or coverage condition as well as its location in the specification. This allows 
the tester to quickly focus on the part of the specification that requires more testing. 

The ADLscope coverage browser is implemented in Java to take advantage of 
Java’s portability and platform independence. Most of the GUI components are based 
on Java Foundation Classes. Figure 2 shows a screenshot of the coverage browser. 

4 Specification-Based Coverage Criteria 

In SST, the coverage criteria used are associated with the expressions of the specifica- 
tion language. Some of these criteria are based on existing test selection strategies; for 
instance, the multicondition strategy [19] is used for logical expressions, and weak mu- 
tation testing [14] is applied to relational expressions. We have also developed custom- 
ized strategies for the following constructs that are particular to ADL: conditional ex- 
pressions, implication expressions, relational expressions, equality expressions, nor- 
mally expressions, group expressions, unchanged expressions, and quantified 
expressions. Refer to [5] for details on the coverage criteria for these types of expres- 
sions. 
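
As a rough illustration of the multicondition idea for a logical expression, the following Python sketch enumerates every combination of operand truth values and reports which ones a set of test inputs leaves uncovered. It follows the textbook notion of multiple-condition coverage; the exact conditions ADLscope generates are defined in [5] and may differ, and the operands used here are hypothetical.

```python
from itertools import product


def multicondition_targets(n_operands):
    # Every combination of truth values of the operands.
    return set(product((False, True), repeat=n_operands))


def covered_combinations(operand_fns, test_inputs):
    # Combinations of operand values actually produced by the test inputs.
    return {tuple(bool(f(x)) for f in operand_fns) for x in test_inputs}


# Operands of a hypothetical expression: (x > 0) and (x is even).
operands = (lambda x: x > 0, lambda x: x % 2 == 0)

targets = multicondition_targets(len(operands))
covered = covered_combinations(operands, [4, 3, -2])
missing = targets - covered
```

With the inputs 4, 3, and -2, the only combination left uncovered is the one where both operands are false, so an input such as -3 would complete the set.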
Figure 2: ADLscope coverage browser.



5 Experimental Evaluation 

We conducted an experiment to evaluate SST/ADLscope along several measures, in- 
cluding defect detection, code coverage, and usability. This section describes the exper- 
iment and its goals, summarizes the characteristics of the subject programs, and reports 
our findings. The main goal of our experiment is to evaluate the defect detection capa- 
bilities of SST/ADLscope. A secondary goal is to analyze the usability of the SST tech- 
nique and the ADLscope tool. Since this technology is new, it is important to determine 
whether it is usable in practice. Usability is categorized into three attributes: tractability, 
satisfiability, and process practicality. 

Mutation testing [3, 7, 13] is a technique for estimating the defect detection
capability of a set of test data. In mutation testing, a test suite is judged by its
ability to detect intentionally seeded faults (called mutations). The assumption is
that if a test suite adequately detects the mutations, it is then adequate for
detecting real faults.¹ To measure defect detection, ADLscope was used to generate
coverage conditions for the subject programs and test data were developed to satisfy
the coverage conditions. The adequacy of these test data for testing the
implementation was evaluated according to mutation testing metrics. Our hypothesis
here is that the test data required by SST/ADLscope are relatively
“mutation-adequate” and hence good at defect detection.

1. Arguments supporting this claim have been made by the authors of mutation
testing [1, 7, 20].
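
The notion of mutation adequacy can be made concrete with a small Python sketch. The function, the three hand-written mutants, and the two test suites are hypothetical; a mutant counts as detected ("killed") when at least one test input makes its output differ from that of the original.

```python
def absolute(x):
    # Original program.
    return x if x >= 0 else -x


MUTANTS = [
    lambda x: x if x >= 0 else x,     # negation dropped
    lambda x: -x if x >= 0 else -x,   # always negates
    lambda x: x if x <= 0 else -x,    # relational operator flipped
]


def mutation_score(original, mutants, test_inputs):
    # Fraction of mutants killed by at least one test input.
    killed = sum(
        any(mutant(x) != original(x) for x in test_inputs)
        for mutant in mutants
    )
    return killed / len(mutants)


weak_score = mutation_score(absolute, MUTANTS, [0, 5])
strong_score = mutation_score(absolute, MUTANTS, [0, 5, -5])
```

The suite without a negative input misses the first mutant, while adding -5 kills all three; the higher score is what "mutation-adequate" refers to.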

By tractability, we mean that the number of test cases required by
specification-based coverage criteria (SST/ADLscope) should be reasonable (practical
and manageable) relative to the size of the program being tested. One straightforward
test selection
strategy would be to consider all possible combinations of input and/or coverage con- 
ditions. Yet, this often results in combinatorial explosion, where it is infeasible either 
for the tester to compute valid input values for each combination or for the machine to 
execute all combinations. Moreover, many combinations are neither meaningful nor in- 
teresting; for example, it may not be desirable to combine an exceptional condition with 
all possible normal conditions. Thus, an important aspect of a test selection strategy is 
its tractability — it should never require an impractical or unmanageable number of test 
cases. Our hypothesis here is that SST/ADLscope does not require an excessive number 
of coverage conditions (and hence test data) relative to the size of the subject programs, 
and hence the technology is tractable. 
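
To make the combinatorial argument concrete, the following sketch compares the number of test obligations under exhaustive combination with the number under a criterion that treats each condition on its own. The figure of two obligations per condition (one true, one false) is an illustrative assumption rather than ADLscope's actual rule, and the function names are hypothetical.

```python
def combined_obligations(num_conditions: int) -> int:
    """All true/false combinations of the conditions: 2**k."""
    return 2 ** num_conditions


def localized_obligations(num_conditions: int) -> int:
    """Each condition covered as true and as false on its own: 2*k (assumed)."""
    return 2 * num_conditions


def growth_table(sizes):
    """Return (k, combined, localized) for each number of conditions k."""
    return [(k, combined_obligations(k), localized_obligations(k)) for k in sizes]


if __name__ == "__main__":
    for k, combined, localized in growth_table([5, 10, 20]):
        print(f"{k:>2} conditions: {combined:>9} combined vs {localized:>3} localized")
```

Under these assumptions the combined count grows exponentially while the localized count grows linearly, which is the gap the tractability hypothesis is concerned with.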

By satisfiability, we mean that the ratio of unsatisfiable coverage conditions generated
by SST/ADLscope should be low. As is the case with code coverage criteria (such
as branch coverage, dataflow coverage, and mutation coverage),¹ specification-based
coverage criteria can “require” conditions that cannot be satisfied by an implementation
under any valid input conditions. Unsatisfiable coverage conditions are not flaws in the
coverage criteria, but are often the result of particular features of the specification or
implementation, such as the logical definition and order of expression evaluation. If the
number of unsatisfiable coverage conditions is high, the tester would be forced to spend
too much time determining [un]satisfiability of uncovered conditions relative to
successfully finding data to satisfy the conditions; this would make SST/ADLscope
effectively unusable. Our hypothesis here is that SST/ADLscope does not produce an
excessive number of unsatisfiable coverage conditions relative to the total number of
coverage conditions, and hence the technology is satisfiable.

By process practicality, we mean that SST/ADLscope is usable in a reasonable test- 
ing process. In practice, testers often initiate the testing process with functional testing 
and then move on to structural testing to cover those parts of the source code that have 
not yet been covered. In structural testing, a coverage condition defines how a source 
code fragment can be executed. Structural testing is quite expensive; it is difficult and 
labor-intensive to find input values that traverse a path reaching the source code frag- 
ment to be covered and then satisfy the coverage condition. On the other hand, most 
functional test data can be developed with an adequate understanding of the functional- 
ity of the system, hence the name black box testing. Therefore, beginning with function- 
al testing and augmenting with structural testing makes the testing process more prac- 
tical, especially if the functional tests have achieved high code coverage so that the ef- 
fort to conduct structural testing is minimized. SST/ADLscope aids the tester/user 
during functional testing by providing information about the completeness of the test 
suite and thus reduces the cost of structural testing. To measure process practicality,
therefore, we studied the relationship between specification-based (functional) testing,
as defined and measured by SST/ADLscope, and implementation-based (structural)
testing, in this case branch coverage. Our hypothesis here is that the test data required
by ADLscope consistently achieve high code coverage, and hence the test process is
practical.

1. There is generally a lack of experimental research to measure satisfiability quantitatively.

5.1 Programs under Evaluation 

The programs under evaluation include programs from the research literature, as 
well as several APIs commonly used by programmers. Overall, 117 functions from ten
programs are evaluated. The ADL specification of these functions occupies 2,477 lines 
of text. The implementation occupies 2,908 lines of text. Table 1 provides more detailed 
information about the programs under evaluation. 



Table 1: Programs under evaluation.

Programs       # Public functions   # Lines of specification   # Lines of code
BitSet                         12                        150               343
Hashtable                      14                        217               380
String                         29                        509               648
StringBuffer                   20                        407               348
Vector                         25                        323               323
Astring                         5                        463               250
Bank                            2                         86               128
Elevator                        1                        105               254
Queue                           4                        127               173
Stack                           5                         90                61
Total                         117                      2,477             2,908



Examples from the literature include a stack ADT, a queue ADT, the bank example, 
the elevator example, and a string package. The specification and the implementation 
of the queue ADT, the bank example, and the string package are provided with the 
ADLT release [29] for demonstration purposes. The specification and implementation 
of the stack ADT and the elevator example are derived from descriptions in the litera- 
ture [23, 33]. 

Several classes from the Java Core API have been included: the standard Java string
class, the string buffer class, the vector class, the bit set class, and the hash table class.
The Java implementation of these classes was translated into equivalent C programs,
which was quite straightforward since Java and C share many similar constructs. When
constructs native to Java were encountered, a set of rules was used to provide a consistent
translation into C constructs. The ADL specifications of these Java classes were developed
based on the Java Language Specification [12], which is the official documentation of
the Java language and the Java Core API. This documentation is in English and does
not use any formal notation, but is relatively unambiguous.

5.2 Test Data Selection 

To select input values from coverage conditions, the following rules are used: (1) for 
each function, select a minimal set of values that would cover all satisfiable coverage 
conditions of that function, and (2) do not use special values and extreme values unless 
required by a coverage condition. Both rules are intended to avoid bias, the motivation 
being to avoid introducing test data that are not required by the coverage criteria and 
might have inflated results. Our goal in selecting test data is to choose values that might 
be selected by any tester utilizing only the information provided by ADLscope. 

Table 2 shows the number of test data that are used in the experiment for each pro- 
gram under evaluation. As previously mentioned, the input values are used to measure 
code coverage and project defect detection capabilities. Details and results are provided 
in the following sections. 



Table 2: Specification-based test data.

Programs       # Test data   Average per function
BitSet                  19                   1.58
Hashtable               31                   2.21
String                 111                   3.83
StringBuffer            50                   2.50
Vector                  52                   2.08
Astring                 25                   5.00
Bank                     9                   4.50
Elevator                11                   11.0
Queue                    6                   1.50
Stack                    9                   1.80
Total                  323                   2.76



5.3 Defect Detection 

Mutation testing is a technique for estimating the defect detection capabilities of a test 
data set. Mutation testing is based upon a set of mutation transformation rules. When 
applied to a program, the rules introduce simple syntactic changes or “mutations” into 
the program, which are treated as seeded faults. Typical transformation rules replace an 
operator with another or alter a variable name in an expression with another variable of 
compatible type. The original program and the mutated program, or “mutant”, are run 
with the same test suite. A mutant is said to be “killed” if it produces output different 
from that produced by the original program. The ratio of killed mutants to the total
number of mutants is determined, and the test data set is considered “mutation-adequate”
if it has killed a sufficient portion of the mutants. The more mutants killed, the more
mutation-adequate the test data set and the better its quality. The basic idea is that a
mutation-adequate test set, which detects enough of the seeded faults, is likely to detect
naturally occurring defects. Mutation testing is used in our experiment to measure the
defect detection capabilities of SST/ADLscope.¹
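
As an illustration of the kill ratio described above, the sketch below applies operator-replacement mutations to a small two-argument maximum function and reports the fraction of mutants killed by a test suite. The subject function, the set of mutants, and all names are hypothetical and chosen only for illustration; they are not taken from the experiment.

```python
import operator


def make_max(compare):
    """Return a two-argument 'max' whose comparison operator is pluggable."""
    def max_of(a, b):
        return a if compare(a, b) else b
    return max_of


original = make_max(operator.gt)      # a if a > b else b
mutants = {
    ">=": make_max(operator.ge),      # same output as the original on every input
    "<": make_max(operator.lt),
    "<=": make_max(operator.le),
    "==": make_max(operator.eq),
    "!=": make_max(operator.ne),
}


def mutation_score(test_inputs):
    """Return (killed / total, names of killed mutants) for the given inputs."""
    killed = [
        name for name, mutant in mutants.items()
        if any(mutant(a, b) != original(a, b) for a, b in test_inputs)
    ]
    return len(killed) / len(mutants), sorted(killed)


if __name__ == "__main__":
    print(mutation_score([(2, 1)]))
    print(mutation_score([(1, 2), (2, 1)]))
```

The `>=` mutant can never be killed because it behaves identically to the original, which is one reason a mutation score below 100% does not necessarily indicate a weak test suite.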

Weak mutation testing [14] is a variation of [strong] mutation testing. Whereas mutation
testing requires the outputs of the original program and the mutated program
to differ, weak mutation testing only requires the intermediate values computed
by the mutated code fragment to differ (typically at the statement level). Weak
mutation testing is weaker than [strong] mutation testing in the sense that even though
the mutated code fragment may compute different values from the original fragment,
the mutated program may still compute the same output as the original program. Thus,
the weak mutation coverage of a test suite is at least as high as its mutation coverage.
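
The difference between the two notions can be seen in a minimal sketch. The functions below are hypothetical: each returns both the program output and the intermediate value computed by the mutated statement, so that the two kill criteria can be compared on the same input.

```python
def original(x):
    t = x + 1             # fragment under mutation
    return t > 0, t       # (program output, intermediate value)


def mutant(x):
    t = x - 1             # operator mutant: '+' replaced by '-'
    return t > 0, t


def classify(x):
    """Report whether input x kills the mutant weakly and/or strongly."""
    out_o, mid_o = original(x)
    out_m, mid_m = mutant(x)
    return {"weakly_killed": mid_o != mid_m, "strongly_killed": out_o != out_m}


if __name__ == "__main__":
    print(classify(5))    # intermediate values differ, outputs agree
    print(classify(1))    # intermediate values and outputs both differ
```

For the input 5 the mutated statement computes a different value, yet the final output is unchanged, so the mutant counts as killed only under the weak criterion.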

Weak mutation testing has the advantage, however, that it is more easily automated 
through the use of source code instrumentation. It also requires less computation time, 
since the instrumented program is only run once for each test case whereas mutation 
testing requires every mutated program and the original program to be run for each test 
case. Weak mutation testing was used in our experiment, primarily because tool support 
for weak mutation testing is more readily available. The lower computing cost is also a 
consideration. 

Two weak mutation metrics are used in the experiment: operator coverage and operand
coverage. In operator coverage, every possible mutation that replaces one operator
with another must be killed. The basic idea is to detect defects where the programmer
used the wrong operator. In operand coverage, every mutation that replaces one operand
with another operand of compatible type must be killed. The basic idea is to detect
defects where the programmer used the wrong operand.

We found that specification-based test data derived from ADLscope coverage
conditions achieved 77.3% (1,571 out of 2,033) operator coverage and 69.0% (4,791
out of 6,941) operand coverage. Table 3 shows operator coverage of specification-based
test data applied to the implementation of each program under evaluation. Table
4 shows operand coverage of specification-based test data. The data show that ADLscope
is quite effective at revealing operator mutants.²



1. We rely on mutation testing researchers’ justification for predicting defect detection but
provide an overview of their assumptions here. The “competent programmer hypothesis”
states that programmers only make small syntactic errors and not complex logical,
design, and control errors [1], and thus the code is relatively close to being correct. The
“coupling effect” states that complex faults are coupled to simple faults and test data
detecting simple faults are sensitive enough to detect more complex faults [7, 20]. The
coupling effect justifies the utility of mutation testing, since the capability to detect simple
mutations is, therefore, indicative of the capability to detect complex faults.

2. Lack of similar experiments for other specification-based testing techniques prevents
us from making a meaningful comparison between ADLscope/SST and those techniques.

Table 3: Operator coverage of SST/ADLscope test data.

Programs       # Operator conditions   # Covered conditions   Percentage
BitSet                           221                    169        76.5%
Hashtable                        291                    192        66.0%
String                           607                    510        84.0%
StringBuffer                     346                    272        78.6%
Vector                           254                    196        77.2%
Astring                          112                     88        78.6%
Bank                              54                     36        66.7%
Elevator                         100                     69        69.0%
Queue                             28                     22        78.6%
Stack                             20                     17        85.0%
Total                          2,033                  1,571        77.3%



Table 4: Operand coverage of SST/ADLscope test data.

Programs       # Operand conditions   # Covered conditions   Percentage
BitSet                          557                    362        65.0%
Hashtable                       843                    491        58.2%
String                        2,932                  2,343        80.0%
StringBuffer                  1,220                    811        66.5%
Vector                          635                    390        61.4%
Astring                         300                    139        46.3%
Bank                             88                     58        65.9%
Elevator                        288                    161        55.9%
Queue                            54                     28        51.9%
Stack                            24                      8        33.3%
Total                         6,941                  4,791        69.0%



The low operand coverage percentages are a bit misleading. Operand coverage requires
any two variables of compatible types to have different values. For example, an
integer constant used as an error code has to be distinguished from a loop index. It is
certainly possible, although unlikely, for a programmer to make a mistake and misuse
a loop index variable in place of an error code. If a function uses, say, five error codes
and three loop indices, then each occurrence of a loop index would have to be distinguished
from the two others (perhaps reasonable) as well as from the five error codes
(possible overkill). This is especially severe in C, a programming language with a weak
typing system. In languages with strong typing systems such as Ada, error codes and
loop indices are typically incompatible types and thus a loop index could not be replaced
by an error code, and vice versa, and no mutant operands would be required to
distinguish error codes from loop indices. We speculate that ADLscope would achieve
greater operand coverage for programs written in strongly-typed languages.
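
The arithmetic in the example above can be written out as a short sketch. The function name and its parameters are hypothetical; the model simply counts how many other operands a single occurrence must be distinguished from, depending on whether the two kinds of operand have compatible types.

```python
def operand_conditions_per_occurrence(same_kind: int, other_kind: int,
                                      strongly_typed: bool) -> int:
    """Number of other operands one occurrence must be distinguished from.

    same_kind:  operands of the occurrence's own kind (e.g. loop indices)
    other_kind: operands of a different kind (e.g. error codes)
    strongly_typed: if True, the two kinds are assumed to be incompatible types
    """
    rivals_of_same_kind = same_kind - 1
    if strongly_typed:
        return rivals_of_same_kind
    return rivals_of_same_kind + other_kind


if __name__ == "__main__":
    # Three loop indices and five error codes, as in the text.
    print(operand_conditions_per_occurrence(3, 5, strongly_typed=False))
    print(operand_conditions_per_occurrence(3, 5, strongly_typed=True))
```

With weak typing each loop-index occurrence carries seven operand conditions; with incompatible types it carries two, which is the reduction the text speculates about for strongly-typed languages.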

5.4 Tractability 

Testing techniques must be tractable; that is, the number of test data required or derived
using the technique must not be excessive relative to the size of the program.
Many forms of path testing, for example, are not tractable because they require
covering an unreasonable number of paths and cannot be satisfied with a manageable
amount of resources. For SST/ADLscope to be tractable, it must not derive an unmanageable
number of coverage conditions, since the tester would spend too much effort computing test data
for the coverage conditions. A related usability attribute is scalability, which might be
defined as the ability of a technique to remain tractable as the size or
complexity of the subject programs increases. In fact, because SST and ADLscope are geared
toward testing functions, we believe that the tractability results here would hold for very
large programs, and hence SST/ADLscope is also scalable.

The number of coverage conditions generated by ADLscope was recorded for each 
function and for each program under evaluation. Table 5 shows the size of the specifi- 
cation and the number of coverage conditions for each program required by ADLscope. 



Table 5: Tractability of SST/ADLscope.

Programs       # Lines of specification   # Coverage conditions   Average per function
BitSet                              150                     117                   9.75
Hashtable                           217                     147                   10.5
String                              509                     635                   21.9
StringBuffer                        407                     385                   19.3
Vector                              323                     225                   9.00
Astring                             463                     165                   33.0
Bank                                 86                      51                   25.5
Elevator                            105                     121                    121
Queue                               127                      29                   7.25
Stack                                90                      44                   8.80
Total                             2,477                   1,802                   15.4

On average, ADLscope generates 15.4 coverage conditions per function. The largest
number of coverage conditions generated for a function is 121. Detailed data are
provided in [4]. All the data show that the number of coverage conditions is not excessive
and can be managed by a single tester.

The number of test data required to cover generated coverage conditions is also a 
consideration for tractability. The data provided in Table 2 show that the number of test 
data for each function is also quite manageable. 

5.5 Satisfiability 

In most implementation-based testing techniques, it is possible to generate unsatisfiable 
coverage conditions. These are usually consequences of infeasible combinations of 
branches, logical conditions, and dependence relationships. Likewise, the coverage 
conditions generated by ADLscope can be unsatisfiable by an implementation. The 
specification itself as well as its style, completeness and correctness can all contribute 
to unsatisfiable coverage conditions as can factors of the implementation. 

It is critical that only a small proportion of the coverage conditions be unsatisfiable. If the
percentage of unsatisfiable coverage conditions is high, the user will waste too much time
determining whether or not uncovered coverage conditions are satisfiable rather than
focusing on finding test data.

For the programs under evaluation, 6.1% of the coverage conditions generated by 
ADLscope are unsatisfiable. It can be safely concluded that unsatisfiable coverage con- 
ditions are not a hindrance to SST/ADLscope’s usage. Table 6 shows the number and 
percentage of unsatisfiable conditions for each program under evaluation. We refer the 
reader to [4] for further discussion on satisfiability. 



Table 6: Satisfiability of SST/ADLscope coverage conditions.

Programs       # Coverage conditions   # Unsatisfiable conditions   Percentage
BitSet                           117                           12        10.3%
Hashtable                        147                            0         0.0%
String                           635                           42         6.6%
StringBuffer                     385                           18         4.7%
Vector                           225                           18         8.0%
Astring                          165                           17        10.3%
Bank                              51                            2         3.9%
Elevator                         121                            0         0.0%
Queue                             29                            1         3.4%
Stack                             44                            0         0.0%
Total                          1,802                          110         6.1%

5.6 Process Practicality

Measuring code coverage is a technique commonly used by testers to obtain an estimate 
of the amount of additional testing required and to determine the quality of an existing 
test suite. As discussed previously, a typical approach is to begin with functional testing 
and then augment the test data set as needed using structural testing techniques to satisfy 
code coverage. In this experiment, each program under evaluation is tested with speci- 
fication-based test data. Branch coverage, which is a common requirement of standard- 
ized test processes, is measured by GCT [18]. 

We measured that 469 out of 540 branches (or 86.9%) are covered by the specifi- 
cation-based test data. Table 7 breaks down the branch coverage by the programs under 
evaluation. 



Table 7: Branch coverage of SST/ADLscope test data.

Programs       # Branches   # Covered branches   Percentage
BitSet                 60                   42        70.0%
Hashtable              72                   57        79.2%
String                166                  158        95.2%
StringBuffer           98                   80        81.6%
Vector                 68                   63        92.7%
Astring                40                   34        85.0%
Bank                   10                   10       100.0%
Elevator               14                   14       100.0%
Queue                   6                    5        83.3%
Stack                   6                    6       100.0%
Total                 540                  469        86.9%



Uncovered branches are usually the result of implementation details that are not
(and should not be) described in the specification. Special cases dealing with such
issues are not covered by specification-based test data because they are not described
even remotely in the specification.

6 Related Work 

Table 8 shows a matrix that summarizes the comparison of SST/ADLscope with related
specification-based test selection techniques. Condition tables [11] and category
partitioning [21] are techniques to be carried out by human testers. SITE [16], ASTOOT
[8], and Daistish [15] are tools that support automated test selection from algebraic
specifications. The multicondition strategy [19], the meaningful impact strategy [10,
30, 31], and the strict ordering strategy [9] are methods for selecting test cases from
simple boolean expressions. SoftTest [2] is a tool that implements the meaningful impact
strategy. Revealing subdomains [32], partition analysis [24], and TTF [27] are
techniques based on axiomatic specifications, whose automation is intractable. As far
as we know, ADLscope is the only automated and tractable axiomatic specification-based
test selection tool. A recent survey of automated specification-based testing techniques
[22] did not mention any comparable techniques and tools.



Table 8: Specification-based test selection techniques.

                          Automated       Automatable                      Manual/Intractable
Informal specifications                                                    Condition table [11],
                                                                           Category partitioning [21]
Algebraic specifications  SITE [16],
                          ASTOOT [8],
                          Daistish [15]
Boolean expressions       SoftTest [2]    Multicondition [19],
                                          Meaningful impact [10, 30, 31],
                                          Strict ordering [9]
Axiomatic specifications  SST/ADLscope                                     Revealing subdomains [32],
                                                                           Partition analysis [24],
                                                                           TTF [27]



There are several reasons that we are able to develop automated support for our test
selection technique for axiomatic specifications:

• Most manual approaches advocate combining conditions, which results in com- 
binatorial explosion. ADLscope uses “localized” coverage criteria that do not 
combine conditions and thus is scalable. Theoretically, SST does not provide as 
thorough coverage as combined conditions would; even so, our experimental re- 
sults show that it achieves very good defect detection and code coverage. 

• Several approaches require applying symbolic evaluation to the specification to 
find “paths” in the specification. Symbolic evaluation requires complex tool sup- 
port and expensive computing resources. Even with tool support, symbolic eval- 
uation must be carried out with user interaction to cut down the number of paths 
to explore. SST/ADLscope uses a simpler approach that is similar to those used 
in common instrumentation-based code coverage tools. The experimental results 
indicate that SST/ADLscope is quite cost effective. We speculate that it is much 
more cost effective than any manual or symbolic evaluation-based techniques. 

• Many techniques are not automated because of the representation mapping problem,
which requires a mechanism for mapping entities in the specification to entities
in the implementation and vice versa. For a system specified in Z and implemented
in Ada, for example, the relationships between the Z and Ada entities
are needed. ADLscope takes advantage of the fact that ADL uses mappable data
representation and control structure with respect to C. In terms of data representation,
ADL includes all the data types in the C language and adds a few new
types that can be mapped easily to equivalent C types. In terms of control structure,
ADL specifications have only two states: the call state and the return state,
corresponding to the state immediately before a function is called and the state
immediately after a function returns in C programs, respectively.

In some sense, automation is achieved in ADLscope at the expense of portability; ADL 
specifications must be used with C programs. Yet we believe the ideas transcend. 

7 Conclusion and Future Work 

SST/ADLscope has demonstrated satisfactory defect detection, achieving 77.3% oper- 
ator coverage and 69.0% operand coverage. SST/ADLscope has also demonstrated ex- 
cellent code coverage, achieving 86.9% branch coverage. This suggests that after spec- 
ification-based testing with SST/ADLscope, the tester only needs to provide test cases 
to cover one out of every seven branches to achieve full branch coverage. The usability
of SST/ADLscope is also satisfactory, at least according to our experiment.

The main contribution of SST/ADLscope is that it provides a rigorous approach to 
unit and API testing by incorporating the specification, in a systematic manner, into the 
test selection process. The added rigor can only lead to improved overall system quality. 
Automation in the test selection process not only improves efficiency and productivity 
but is also less prone to human error. 

There are several possible extensions to the specification-based coverage metrics 
used in SST/ADLscope. Many test selection strategies have been proposed in the re- 
search literature. It is difficult to compare their effectiveness analytically, however, 
since most analysis techniques cannot take into account the specification paradigm, the 
particular application at hand, the completeness and correctness of the specification and 
the implementation, and many other issues. SST/ADLscope presents an opportunity to 
quantitatively measure defect detection capabilities of test selection strategies. 

The ideas embedded in SST/ADLscope can be applied to other specification lan- 
guages. ANNA [17] is very similar to ADL, providing axiomatic specification support 
for Ada programs; thus, developing something like ANNAscope for Ada should be 
straightforward. SST’s coverage criteria could be applied to Z [26], but the mapping 
problem cannot be avoided. 

Structural specification-based testing can also be applied to other specification par- 
adigms. Guards on state transitions in state-based languages are logical expressions, 
which are amenable to SST’s coverage criteria, as are the logic formulas in temporal 
logic languages. 

Finally, we would also like to expand the experiments to include larger projects and 
a greater variation of application domains. This will further help us evaluate SST/ 
ADLscope’s weaknesses and strengths as well as further evaluate specification-based
testing. 

Abstract. Dynamic program slicing methods are widely used for debugging,
because many statements can be ignored in the process of localizing
a bug. A dynamic program slice with respect to a variable contains only 
those statements that actually had an influence on this variable. How- 
ever, during debugging we also need to identify those statements that 
actually did not affect the variable but could have affected it had they 
been evaluated differently. A relevant slice includes these potentially af- 
fecting statements as well, therefore it is appropriate for debugging. In 
this paper a forward algorithm is introduced for the computation of rel- 
evant slices of programs. The space requirement of this method does not 
depend on the number of different dynamic slices nor on the size of the 
execution history, hence it can be applied for real size applications. 

Keywords: Dynamic slicing, relevant slicing, debugging 



1 Introduction 

Program slicing methods are widely used for debugging, testing, reverse engineering
and maintenance (e.g. [7], [17], [5], [8]). A slice consists of all statements
and predicates that might affect the variables in a set V at a program point p
[19]. A slice may be an executable program or a subset of the program code. In
the first case the behaviour of the reduced program with respect to a variable
v and program point p is the same as that of the original program. In the second case
a slice contains a set of statements that might influence the value of a variable
at point p. Slicing algorithms can be classified according to whether they only
use statically available information (static slicing) or compute those statements
which influence the value of a variable occurrence for a specific program input
(dynamic slicing).

In this paper we are concerned with using slicing methods in program debug- 
ging. During debugging we generally investigate the program behaviour under 
the test case that revealed the error, not under any generic test case. Therefore, 
dynamic slicing methods are more appropriate than static ones. By using a dynamic
slice many statements can be ignored in the process of localizing a bug.
Different dynamic slicing methods for debugging are introduced in e.g. [12], [11],
[3]. In [3] Agrawal and Horgan presented a precise dynamic slicing method which
is based on the graph representation of the dynamic dependences. This graph,
called a Dynamic Dependence Graph (DDG), includes a distinct vertex for each
occurrence of a statement. A dynamic slice created from the DDG with respect
to a variable contains those statements that actually had an influence on this
variable. (We refer to this slice as the DDG slice.) However, during debugging we
also need to identify those statements that actually did not affect the variable
but could have affected it had they been evaluated differently. In the next section
we present a simple example where the DDG slice with respect to
a variable does not contain such statements, although they may affect the value of the
variable. (We say that the value of the variable is potentially dependent on
those statements.)

In [4] Agrawal et al. introduced a new type of slicing, called relevant slicing,
for incremental regression testing. The relevant slice is an extension of the DDG
slice with potentially dependent predicates and their data dependences. They
suggested relevant slicing to solve the problem of determining the test cases in a
regression test suite on which the modified program may differ from the original
program. Since a relevant slice with respect to a variable occurrence includes
all statements which may affect the value of the variable, and this slice is as
precise as possible, we would like to emphasize that relevant slices are
very suitable for debugging as well. The DDG slice component of the relevant
slice is precise, but to determine potential dependences we need to use static
dependences as well. The static dependences may be imprecise for a program
including e.g. unconstrained pointers, hence we may not obtain a precise relevant
slice using imprecise static information. Note that it is not only the relevant slicing
method that requires static information: any other algorithm for computing potential
dependences has to use the same imprecise static dependences.

In the method introduced in [4] the DDG representation is used for the
computation of the relevant slice. The major drawback of this approach is that
the size of the DDG is unbounded. Although Agrawal and Horgan suggested a
method for reducing the size of the DDG [3], even this reduced DDG may be
very large for an execution which has many different dynamic slices. Therefore,
the method of Agrawal et al. cannot be applied to real-size applications, where
millions of execution steps may be performed for a given test case.

In this paper we introduce a forward computation method for relevant slices.
First, we give a forward algorithm to compute a dynamic slice which is equivalent
to the DDG slice, then we augment this algorithm with the computation
of the potential dependences. The main advantage of our approach is that the
space requirement of this algorithm is very limited compared to that of Agrawal
et al. In our case the space requirement for the computation of the dynamic
slice is O(m²), where m is the total number of the different variables, predicate
statements and output statements in the program. To determine potential dependences
we also need the static Program Dependence Graph ([16], [9]), the
size of which is O(n²), where n is the number of the statements in the program.
The space requirement of our approach does not depend on the test case, therefore
this method can be applied to compute relevant slices for inputs which
produce millions of execution steps.

In the next section the basic concepts of dynamic and relevant slices are
presented. Section 3 provides a detailed description of our algorithm. Section 4
discusses relations to other work. We have already developed a small prototype
for the algorithm presented in the paper. Section 5 summarizes our experience
with this implementation and highlights future research.



2 Dynamic and Relevant Slicing 

In some applications static program slices contain superfluous instructions. This 
is the case for debugging, where we have dynamic information as well (the pro- 
gram has been executed). Therefore, debugging may require smaller slices, which 
improves the efficiency of the bug revealing process ([2], [13]). The goal of the
introduction of dynamic slices was to determine those statements more precisely 
that may contain program faults, assuming that the failure has been revealed 
for a given input. 

First, we briefly review the program representation called Program Depen- 
dence Graph (PDG) ([16], [9]). The PDG of a program contains one vertex for 
each statement and control predicate. There are two types of directed edges in 
the PDG: data dependence edges and control dependence edges. The graph has 
one data dependence edge from vertex n to vertex m if the statement at vertex 
m uses the value of a variable x that is defined in the statement at vertex n
and there is a path in the control-flow graph of the program from n to m along
which x is not redefined. There is a control dependence edge from a predicate
vertex p to a vertex n if (1) p has two exits; and if (2) following one of the exits 
from p always results in n being executed, while taking the other exit may result 
in n not being executed. (We can say that the execution of n directly depends 
on the evaluation of predicate p.) In Fig. 1 we present a small example program 
and the PDG of this program. 

The static slice of a program can be easily constructed using a PDG. This 
slice with respect to a variable v at vertex n contains nodes whose execution 
may affect the value v at vertex n. In our example the static slice of the variable 
s at vertex 14 contains all statements of the program. 
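
As a rough illustration of how a static slice can be read off a PDG, the following Python sketch performs a backward reachability search. The dictionary `PDG` is an assumption: it lists, for each vertex of the example program, the vertices it is data or control dependent on, as derived by hand from the program text rather than copied from Fig. 1. The function name `static_slice` is likewise hypothetical.

```python
from collections import deque

# Assumed PDG of the example program: vertex -> vertices it depends on
# (data and control dependence edges, stored in reverse for backward search).
PDG = {
    1: [], 2: [], 3: [], 4: [2, 3], 5: [2], 6: [], 7: [],
    8: [1, 6, 13], 9: [4, 8], 10: [5, 9], 11: [10],
    12: [3, 7, 8, 11, 12], 13: [6, 8, 13], 14: [7, 12],
}


def static_slice(pdg, vertex):
    """Return every vertex from which `vertex` is reachable, plus itself."""
    seen = {vertex}
    work = deque([vertex])
    while work:
        v = work.popleft()
        for dep in pdg[v]:
            if dep not in seen:
                seen.add(dep)
                work.append(dep)
    return seen


print(sorted(static_slice(PDG, 14)))
```

Under these assumed edges the slice for vertex 14 covers all fourteen statements, which agrees with the observation above.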

Prior to the description of different dynamic slicing approaches some back- 
ground is necessary which is demonstrated using the example in Fig. 1. 

A feasible path that has actually been executed will be referred to as an ex- 
ecution history and denoted by EH. Let the input be a = 0, n = 2 in the case of 
our example. The corresponding execution history is (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 
13, 8, 9, 10, 12, 13, 8, 14). We can see that the execution history contains the
instructions in the same order as they have been executed, thus EH(j) gives the
serial number of the instruction executed at the jth step, referred to as execution
position j.

 1. read(n)
 2. read(a)
 3. x := 1
 4. b := a + x
 5. a := a + 1
 6. i := 1
 7. s := 0
 8. while i <= n do
 9.   if b > 0 then
10.     if a > 1 then
11.       x := 2
12.   s := s + x
13.   i := i + 1
    endwhile
14. write(s)

Fig. 1. The PDG of a simple program

To distinguish between multiple occurrences of the same instruction in the
execution history we use the concept of an action, which is a pair (i, j)
written as $i^j$, where i is the serial number of the instruction at execution
position j. For example, $14^{19}$ is the action for the output statement of our example
for the same input as above.
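
The execution history and the actions can be reproduced with a small instrumented version of the example program. This is only a sketch: the function `run_example` and the list-of-pairs encoding of actions are our own illustrative choices, not part of the method.

```python
def run_example(a, n):
    """Run the program of Fig. 1 and return (execution history, output s)."""
    eh = []
    eh.append(1)              # 1. read(n)
    eh.append(2)              # 2. read(a)
    eh.append(3)
    x = 1
    eh.append(4)
    b = a + x
    eh.append(5)
    a = a + 1
    eh.append(6)
    i = 1
    eh.append(7)
    s = 0
    while True:
        eh.append(8)          # the loop predicate is recorded on every test
        if not i <= n:
            break
        eh.append(9)
        if b > 0:
            eh.append(10)
            if a > 1:
                eh.append(11)
                x = 2
        eh.append(12)
        s = s + x
        eh.append(13)
        i = i + 1
    eh.append(14)             # 14. write(s)
    return eh, s


EH, output = run_example(a=0, n=2)
# An action pairs the instruction number with its execution position (from 1).
actions = [(stmt, pos) for pos, stmt in enumerate(EH, start=1)]
print(EH)
print(actions[-1])
```

For the input a = 0, n = 2 the last action is the pair (14, 19), i.e. the output statement at execution position 19.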

The first precise definition of dynamic slices can be found in [12]. In that
paper Korel and Laski defined a dynamic slice as an executable program which can
be created from the original program by deleting some statements. It is required
that for a given input the dynamic slice compute the same values as the original
program for a variable v at some selected execution position. We can define the
dynamic slicing criterion as a triple $(x, i^j, V)$, where x denotes the input, $i^j$ is
an action in the execution history, and V is a set of variables. For a slicing
criterion a dynamic slice can be defined as the set of statements which may affect
the values of the variables in V. Under this definition, which is slightly
different from the original [12] (but follows the concept of subsequent slicing
definitions in different papers), the slice is usually not an executable program,
but a subset of the original code.

Agrawal and Horgan [3] defined dynamic slicing in a different way, as follows.
Given an execution history H of a program P for a test case t and a variable
v, the dynamic slice of P with respect to H and v is the set of statements in H
whose execution had some effect on the value of v as observed at the end of the
execution.

Apart from the small difference that this definition considers only a single
variable v, the main difference is that they consider this variable at the end of
the execution history. Thus we take no interest in the value of v except when it
is observed for the last time. This is a more precise and appropriate approach
for debugging.

Agrawal and Horgan introduced a new method which uses a Dynamic Dependence
Graph (DDG) to take into account that the different occurrences of
a given statement may be affected by different sets of statements due to
redefinitions of variables. In the DDG there is a distinct vertex for each occurrence
of a statement in the execution history. Using this graph a precise dynamic slice
can be created. The main drawback of using the DDG is the size of this graph.
The number of vertices in a DDG is equal to the number of executed statements,
which is unbounded. To improve the size complexity of the algorithm
Agrawal and Horgan suggested a method for reducing the number of vertices in
the DDG. The idea of this method is that a new vertex is created only if it can
create a new dynamic slice. Thus the size of this reduced graph is bounded by
the number of different dynamic slices. It was shown in [18] that the number of
different dynamic slices is in the worst case $O(2^n)$, where n is the number of
statements.

Both dynamic slicing approaches were introduced to promote debugging.
However, neither of these approaches is suitable for debugging, because it may
happen that the slice does not contain the statement which is responsible for a
program failure. For example, consider our example program in Fig. 1.

If we compute a precise dynamic slice for the slicing criterion
((a = 0, n = 2), $14^{19}$, s) using the DDG slicing method we get the dynamic slice of the program
presented in Fig. 2.

We can observe that this slice does not include statements 2., 5. and 10., since
after removing these instructions (and instruction 11.) the value of s in 14. (referred
to as the output) remains unchanged. However, this slice is not adequate for debugging, because if the
faulty statement is 5., that is, correctly 5. a := a + 2, then the value of the
output will change in statement 14. Therefore a correct slice has to include
those statements which, if corrected, may cause the modification of the value of
the variables at which the program failure has been manifested.

Fig. 2. The framed statements give the dynamic slice

In [4] Agrawal et al. defined a potential dependence relation among the variable
occurrences and predicates. The precise definition of potential dependence
is given in the next section; however, we can explain it intuitively on
the example program. We can say that variable x in statement 12. is potentially
dependent on the predicate in statement 10. for the input a = 0, because if
the evaluation of this predicate changed (from false to true), then the
value of x (and also s at the output) would be modified as well. As opposed to
this, variable x is not potentially dependent on the predicate in statement 9.,
because if we change the evaluation of this predicate (from true to false) for
input a = 0, then this change has no influence on the output. On the basis of
potential dependences Agrawal et al. introduced a new slicing method for regression
testing. This method, called relevant slicing, inserts the potentially affecting
predicates and their data dependences into the slice. The relevant slice of our
sample program for the slicing criterion ((a = 0, n = 2), $14^{19}$, s) is presented in
Fig. 3. According to the informal definition of the relevant slice, it contains the
dynamic slice and in addition includes those statements which, if corrected, may
cause the modification of the value of the variables at which the program failure
has been manifested.

In [4] Agrawal et al. give a brief description of the computation of the
relevant slice and discuss the advantage of using such a slice during incremental
regression testing. Their method starts with the computation of the dynamic slice
using the DDG, then they extend this graph with the dependences introduced
by potentially affecting predicates. As we have mentioned earlier, the size of
the DDG may be very large, hence in the next section we present a method
for the forward computation of the relevant slice which significantly reduces the
required memory.

Fig. 3. Relevant slice of the program

We can see from the description of relevant slicing that it contains all the 
instructions that may cause the observed failure, thus this method is a suitable 
means for program debugging. 



3 Forward Relevant Slice Algorithm 



For simplicity, we present the relevant slice algorithm in two steps. First, the 
dynamic slice is determined, then this algorithm is extended to derive the entire 
relevant slice in the second step. For clarity we demonstrate our method on a 
program with simple statements. 

Our algorithm is forward, and we obtain the necessary information, i.e. the
dynamic (and then the relevant) slice for a given instruction, as soon as this
instruction has been executed. As a consequence, our method is global, i.e. after
the last instruction has been executed we obtain the dynamic slice for all the
instructions processed previously. In contrast, former methods involving
backward processing compute the slices only for a selected instruction (and
the variables used at this instruction). Global slicing is not necessary for debugging,
but can be very useful for testing.

Our algorithm does not necessitate a Dynamic Dependence Graph. Instead, 
we compute and store the set of statements that affect the currently executed 
instruction. In this way we avoid any superfluous information (that may be 
unbounded). 

Prior to the description of the algorithm we review and introduce some
basic concepts and notations. For clarity we rely on [12], but in some cases the
necessary modifications have been made.

We demonstrate our concepts for dynamic slicing by applying them to the
example program in Fig. 1.

We apply a program representation which considers only the definition and
the use of variables, and in addition, it considers direct control dependences.
We refer to this program representation as the D/U program representation. An
instruction of the original program has a D/U expression as follows:

i. d : U,

where i is the serial number of the instruction, and d is the variable that gets
a new value at the instruction in the case of assignment statements. For an
output statement or a predicate, d denotes a newly generated "output-variable"
or "predicate-variable" name of this output or predicate, respectively (see the
example below). Let $U = \{u_1, u_2, \ldots, u_n\}$ such that any $u_k \in U$ is either a variable
that is used at i or a predicate-variable on which instruction i is (directly)
control dependent. Note that there is at most one predicate-variable in each U.
(If the entry statement is defined, there is exactly one predicate-variable in each
U.)

Our example has a D/U representation shown below: 



 i. d:     U

 1. n:
 2. a:
 3. x:
 4. b:     a, x
 5. a:     a
 6. i:
 7. s:
 8. p8:    i, n
 9. p9:    b, p8
10. p10:   a, p9
11. x:     p10
12. s:     s, x, p8
13. i:     i, p8
14. o14:   s



Fig. 4. D/U representation of the program 



Here p8, p9 and p10 are used to denote predicate-variables and o14 denotes
the output-variable, whose value depends on the variable(s) used in the output
statement.
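
One possible encoding of the D/U representation is a plain dictionary, sketched below. The names `DU` and `PREDICATE_VARS` are illustrative assumptions; the entries are transcribed from Fig. 4. The final check confirms the remark that each U contains at most one predicate-variable.

```python
# statement number i -> (d, U), transcribed from the D/U representation of Fig. 4
DU = {
    1: ("n", set()),
    2: ("a", set()),
    3: ("x", set()),
    4: ("b", {"a", "x"}),
    5: ("a", {"a"}),
    6: ("i", set()),
    7: ("s", set()),
    8: ("p8", {"i", "n"}),
    9: ("p9", {"b", "p8"}),
    10: ("p10", {"a", "p9"}),
    11: ("x", {"p10"}),
    12: ("s", {"s", "x", "p8"}),
    13: ("i", {"i", "p8"}),
    14: ("o14", {"s"}),
}

# Predicate-variables are the names generated for the predicate statements.
PREDICATE_VARS = {"p8", "p9", "p10"}

# Count how many predicate-variables occur in each U.
predicate_counts = {i: len(used & PREDICATE_VARS) for i, (_, used) in DU.items()}
print(predicate_counts)
```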

3.1 Forward Computation of the Dynamic Slice

Now we can derive the dynamic slice with respect to an input and the related
execution history based on the D/U representation of the program as follows.
We process each instruction in the execution history starting from the first one.
Processing an instruction i. d : U, we derive a set DynSlice(d) that contains all
the statements which affect d when instruction i has been executed. By applying
the D/U program representation the effect of data and control dependences can
be treated in the same way.

After an instruction has been executed and the related DynSlice set has been
derived, we determine the last definition position for the newly assigned variable
d, denoted by LP(d). Very simply, the last definition position of variable d is
the execution position where d was defined last. Obviously, after processing the
instruction i. d : U at the execution position j, LP(d) will be j for each subsequent
execution until d is defined the next time. We also use LP(p) for predicates, which
means the last execution position of predicate p. For simplicity, the serial number
of the instruction which is executed at position LP(d) is denoted by LS(d),
where LS(d) = EH(LP(d)) (considering the instruction i. d : U, LS(d) = i).
For example, if EH(13) = 8 (the current action is $8^{13}$), then LS(d) = 8 and
LP(d) = 13. We note that LP(d) is not directly used in our dynamic slicing
algorithm, but we need it for the relevant slicing method.

Now the dynamic slices can be determined as follows. Assume that we are
running a program on input t. After an instruction i. d : U has been executed at
position p, DynSlice(d) contains exactly the statements involved in the dynamic
slice for the slicing criterion $C = (t, i^p, U)$. DynSlice sets are determined by the
equation below:

$$DynSlice(d) = \bigcup_{u_k \in U} \left( DynSlice(u_k) \cup \{LS(u_k)\} \right)$$

After DynSlice(d) has been derived we determine LP(d) and LS(d) for assignment
and predicate instructions, i.e.

LP(d) = p, LS(d) = EH(p)

Note that this computation order is strict, since when we determine DynSlice(d),
we have to consider the LP(d) and LS(d) that occurred at a former execution position
instead of p (consider the program line x = x + y in a loop).

We can see that during the dynamic slice determination we do not use a (possibly
huge) Dynamic Dependence Graph, only a D/U program representation which
requires less space than the original source code, and the method above creates
the same dynamic slice as the application of the DDG in [4].

The formalization of the forward dynamic slice algorithm is presented in Fig. 5.

program DynamicSlice
begin
  Initialize LS and DynSlice sets
  ConstructD/U
  ConstructEH
  for j = 1 to number of elements in EH
    the current D/U element is $i^j$. d : U
    $DynSlice(d) = \bigcup_{u_k \in U} (DynSlice(u_k) \cup \{LS(u_k)\})$
    LS(d) = i
  endfor
  Output LS and DynSlice sets for the last definition of all variables
end



Fig. 5. Dynamic slice algorithm 
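
The algorithm of Fig. 5 can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the authors' prototype: the names `DU`, `EH` and `forward_dynamic_slice` are hypothetical, the D/U table is transcribed from Fig. 4, and the execution history is the one for the input a = 0, n = 2.

```python
# D/U representation of the example program (Fig. 4): statement -> (d, U).
DU = {
    1: ("n", []), 2: ("a", []), 3: ("x", []),
    4: ("b", ["a", "x"]), 5: ("a", ["a"]),
    6: ("i", []), 7: ("s", []),
    8: ("p8", ["i", "n"]), 9: ("p9", ["b", "p8"]),
    10: ("p10", ["a", "p9"]), 11: ("x", ["p10"]),
    12: ("s", ["s", "x", "p8"]), 13: ("i", ["i", "p8"]),
    14: ("o14", ["s"]),
}

# Execution history for the input a = 0, n = 2.
EH = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 8, 9, 10, 12, 13, 8, 14]


def forward_dynamic_slice(du, eh):
    """Return (DynSlice, LS) after processing every action of eh in order."""
    dyn_slice = {}  # d -> set of statement numbers affecting d
    ls = {}         # d -> serial number of the statement that defined d last
    for i in eh:
        d, used = du[i]
        new_slice = set()
        for u in used:
            if u in ls:  # skip names that have not been defined yet
                new_slice |= dyn_slice[u] | {ls[u]}
        # Update only after the union is complete (strict computation order).
        dyn_slice[d] = new_slice
        ls[d] = i
    return dyn_slice, ls


dyn_slice, ls = forward_dynamic_slice(DU, EH)
final_slice = dyn_slice["o14"] | {ls["o14"]}
print(sorted(final_slice))
```

Only one set per variable, predicate-variable or output-variable is kept, so the memory used by this sketch does not grow with the length of the execution history.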



Now we illustrate the above method by applying it to our example program
in Fig. 1 for the execution history (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 8, 9, 10, 12, 13,
8, 14). During the execution the following values are computed:



Action   d     U           DynSlice(d)              LS(d)

1^1      n     ∅           ∅                        1
2^2      a     ∅           ∅                        2
3^3      x     ∅           ∅                        3
4^4      b     {a,x}       {2,3}                    4
5^5      a     {a}         {2}                      5
6^6      i     ∅           ∅                        6
7^7      s     ∅           ∅                        7
8^8      p8    {i,n}       {6,1}                    8
9^9      p9    {b,p8}      {2,3,6,1,4,8}            9
10^10    p10   {a,p9}      {2,3,6,1,4,8,5,9}        10
12^11    s     {s,x,p8}    {6,1,7,3,8}              12
13^12    i     {i,p8}      {6,1,8}                  13
8^13     p8    {i,n}       {6,1,8,13}               8
9^14     p9    {b,p8}      {2,3,6,1,8,13,4}         9
10^15    p10   {a,p9}      {2,3,6,1,8,13,4,5,9}     10
12^16    s     {s,x,p8}    {6,1,7,3,8,13,12}        12
13^17    i     {i,p8}      {6,1,8,13}               13
8^18     p8    {i,n}       {6,1,8,13}               8
14^19    o14   {s}         {6,1,7,3,8,13,12}        14

The final slice can be obtained as the union of DynSlice(o14) and {LS(o14)}.




An Efficient Relevant Slicing Method for Debugging 313 



3.2 Forward Computation of the Relevant Slice 



Now let us extend the above algorithm to involve potential influences, i.e. to
compute relevant slices. Note that this part involves static analysis, therefore
a precise result cannot be obtained. However, this is due to the essence of the
relevant slice and is not a consequence of our method.

First, we try to derive the predicates that might have potential influence on a
selected instruction. To do this, we generate the PDG and extend it. Consider a
data flow edge e from node n to node m. We generate new edges, called conditional
flow edges, from any predicate node p on which n is (transitively) control
dependent to node m (see the illustration below). E.g. in our example program
there is a conditional flow edge from predicate node p = 9 to node m = 12 since
node n = 11 is control dependent on 9 and 12 is data dependent on 11.



[Illustration: control dependence edges, a data flow edge, and the resulting conditional flow edges]



In addition, any conditional flow edge is annotated with the variable name
that is defined at n. Then, without loss of generality, we introduce a boolean
function A(e) on edges that is true if n can be reached from predicate p along
the then branch, otherwise it is false. This result (i.e. whether A(e) is true or false) is
also annotated on edge e = (p, m). In the example above the edge (9, 12) is
annotated with x and true.

Definition. Conditional dependence. If there is a conditional flow edge from p 
to m, then there is a conditional dependence from p to m. □ 
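
The construction of conditional flow edges can be sketched as a walk up the chain of control dependences. The inputs below are assumptions made for illustration: `CONTROL_PARENT` records, for each statement of the example, the predicate it is directly control dependent on and the branch taken (True for the then branch or loop body), and `DATA_EDGES` holds two hand-picked data flow edges of the example rather than the complete PDG.

```python
# n -> (p, branch): n is directly control dependent on predicate p and is
# reached along `branch` (True = then branch / loop body). Assumed by hand.
CONTROL_PARENT = {
    9: (8, True), 10: (9, True), 11: (10, True),
    12: (8, True), 13: (8, True),
}

# Data flow edges (n, m, variable defined at n); an illustrative subset only.
DATA_EDGES = [(11, 12, "x"), (12, 14, "s")]


def conditional_flow_edges(control_parent, data_edges):
    """Return the set of edges (p, m, variable, A(e)) for every data flow edge
    (n, m) and every predicate p on which n is transitively control dependent."""
    edges = set()
    for n, m, var in data_edges:
        node = n
        while node in control_parent:
            p, branch = control_parent[node]
            # A(e) is the branch of p that leads towards n.
            edges.add((p, m, var, branch))
            node = p
    return edges


cond_edges = conditional_flow_edges(CONTROL_PARENT, DATA_EDGES)
print(sorted(cond_edges))
```

The edge (9, 12) annotated with x and true, which is discussed above, is among the generated edges.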

A data flow edge (n, m) in the PDG represents a definition-use pair between 
the related instructions. If the flow of control for a given input traverses both m 
and a predicate p from which n is (transitively) control dependent, but does not 
traverse n, then there is a chance that by modifying p the evaluation of m will 
also be changed. In Fig. 6 we can see the extended PDG of the simple program 
presented in Fig. 1. 

What we need is the potential dependence first defined in [4]. For clarity,
we give a new definition that is equivalent to the old one, but allows the subsequent
relevant slice algorithm to be introduced in a more understandable way.

Our definition needs a new boolean function B on predicate actions such that
$B(p^r)$ = true if after the execution of action $p^r$ the flow of control follows the
then branch, otherwise $B(p^r)$ = false. For clarity, the identification of the PDG
nodes is equivalent to the identification of the instructions, i.e. the node q in the
PDG relates to the instruction q. (This can be done since each node in the PDG
relates to a unique instruction in the module.)

Fig. 6. Extended PDG of the simple program



Definition. Potential dependence. Consider an instruction i. d : U at the current
execution position j. A $u_k \in U$ is potentially dependent on a predicate p
if:

(1) there is a conditional flow edge e = (p, i) annotated with the variable name
$u_k$ and A(e);

(2) there exists an action $p^r$ for predicate p such that $LP(u_k) < r < j$; and

(3) $B(p^r) \neq A(e)$. □

If (1) holds, then there is a (static) definition-use pair for variable $u_k$ between
a node n and our target node i such that n is control dependent on p. If (2) also
holds, then predicate p is executed after variable $u_k$ has been defined (assigned)
last. Therefore, statement n cannot have been executed after action $p^r$. If, in addition,
(3) holds, then by changing the flow of control at p (by modifying p), instruction n
may be executed, affecting $u_k$ at $i^j$.

In our simple example in Fig. 1 for the input a = 0, n = 2 the variable x at
the action $12^{11}$ is potentially dependent on predicate p10 because all three
conditions of the definition hold. However, for predicate p9 condition (3) does
not hold, hence the variable x is not potentially dependent on this predicate.

We can see that when an instruction i is executed the potential dependences can
be determined easily: i may be potentially dependent only on those predicates
on which it is conditionally dependent. These predicates can be determined by
considering the conditional flow edges that arrive at i. Whenever a predicate has been
executed, it can be determined easily which of the branches is followed. Note that
we have to consider only two actions for every predicate (otherwise our method would
also be unbounded). Thus we define for every predicate p: LastTrue(p) is the last
execution position of p where the outcome is true and LastFalse(p) is the last
execution position of p with false outcome (this implies that a predicate p may
be executed several times, but we store only two execution positions at a time).
With this extension both (2) and (3) can be checked easily: if A(e) = true then
check $LP(u_k) < LastFalse(p)$, otherwise $LP(u_k) < LastTrue(p)$. In addition,
we need not store the execution history; only the execution position has to be
incremented after each statement execution.
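
The check of conditions (2) and (3) with LastTrue and LastFalse fits in a single comparison, as the following sketch suggests. The function name `potentially_dependent` and the convention that position 0 (`NEVER`) stands for "not executed or not defined yet" are our own assumptions; condition (1), the existence of the annotated conditional flow edge, is taken as given.

```python
NEVER = 0  # execution positions start at 1, so 0 marks "no such action yet"


def potentially_dependent(lp_uk, a_e, last_true_p, last_false_p):
    """Check conditions (2) and (3) for a variable u_k and a predicate p.

    lp_uk        -- LP(u_k), last definition position of u_k
    a_e          -- A(e), annotation of the conditional flow edge from p
    last_true_p  -- LastTrue(p), or NEVER
    last_false_p -- LastFalse(p), or NEVER
    """
    # Only an action of p whose outcome differs from A(e) can qualify.
    opposite = last_false_p if a_e else last_true_p
    return lp_uk < opposite


# Variable x at the action 12^11 of the example: LP(x) = 3.
dep_on_p10 = potentially_dependent(3, True, NEVER, 10)  # 10^10 was false
dep_on_p9 = potentially_dependent(3, True, 9, NEVER)    # 9^9 was true
print(dep_on_p10, dep_on_p9)
```

Because the stored positions always precede the current position j, the upper bound r < j of condition (2) needs no separate test in this sketch.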

Now we can extend our algorithm to involve potential dependences as well.
Assume that we are processing an instruction i. d : U. In this case the relevant
slice contains three sets: (1) the union of the relevant slices of all $u_k \in U$, (2)
the predicates on which any $u_k \in U$ is potentially dependent, and (3) the
statements that affect the predicates in (2), ignoring control dependences. (To
demonstrate why control dependences have to be ignored, consider the example
above, where x at $12^{11}$ is potentially dependent on p10, but the predicates on
which p10 is control dependent must not be added to the relevant slice, i.e. p9
must be excluded!) The sets described in (2) and (3) are denoted by PotDep(d)
and DataDepSet(PotDep(d)), respectively.

Thus our approach is as follows:

$$RelSlice(d) = \bigcup_{u_k \in U} \left( RelSlice(u_k) \cup \{LS(u_k)\} \right) \cup PotDep(d) \cup DataDepSet(PotDep(d)),$$

where the set PotDep(d) contains the predicates on which any $u_k \in U$ is
potentially dependent at the current execution position.

DataDepSet(PotDep(d)) can be determined as the union of the dependences
arising from the potential dependences:

$$DataDepSet(PotDep(d)) = \bigcup_{p \in PotDep(d)} DataDep(p)$$

Note that DataDep(p) for a predicate p is computed when p is actually
executed, therefore in the above equation (when RelSlice(d) is computed) we
only use these sets.

Finally, the data dependence set for a predicate p that ignores p's control
dependences is the following:

$$DataDep(p) = DataDep(p) \cup \bigcup_{u_k \in V} \left( RelSlice(u_k) \cup \{LS(u_k)\} \right) \cup DataDepSet(PotDep(p)),$$

where V contains all the variables that are used at p (but it does not contain
any predicate).

The formalized forward relevant slice algorithm is presented in Fig. 7. 

Now we illustrate the above method by applying it to our example program
in Fig. 1 for the slicing criterion ((a = 0, n = 2), $14^{19}$, s). During the execution
the following values are computed:

program RelevantSlice
begin
  Initialize LS, LP, RelSlice, DataDepSet and DataDep sets
  ConstructD/U
  Add conditional flow edges to the PDG
  ConstructEH
  for j = 1 to number of elements in EH
    the current D/U element is $i^j$. d : U
    Compute PotDep(d) at $i^j$
    $RelSlice(d) = \bigcup_{u_k \in U} (RelSlice(u_k) \cup \{LS(u_k)\})$
                   $\cup\ PotDep(d) \cup DataDepSet(PotDep(d))$
    if d is a predicate then
      $DataDep(d) = DataDep(d) \cup \bigcup_{u_k \in V} (RelSlice(u_k) \cup \{LS(u_k)\})$
                    $\cup\ DataDepSet(PotDep(d))$
    endif
    LP(d) = j
    LS(d) = i
  endfor
  Output LS and RelSlice sets for the last definition of all variables
end



Fig. 7. Relevant slice algorithm 
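
A compact Python sketch of the relevant slice algorithm is given below. It is an illustration under stated assumptions rather than the authors' implementation: U is split into the used variables and the controlling predicate-variable, `COND_EDGES` contains only the two conditional flow edges into statement 12 that are discussed in the text (the extended PDG of Fig. 6 may contain more), and `TRACE` is the execution history for a = 0, n = 2 with the predicate outcomes added by hand. All names are hypothetical.

```python
# statement -> (d, variables used, controlling predicate-variable or None)
DU = {
    1: ("n", [], None), 2: ("a", [], None), 3: ("x", [], None),
    4: ("b", ["a", "x"], None), 5: ("a", ["a"], None),
    6: ("i", [], None), 7: ("s", [], None),
    8: ("p8", ["i", "n"], None), 9: ("p9", ["b"], "p8"),
    10: ("p10", ["a"], "p9"), 11: ("x", [], "p10"),
    12: ("s", ["s", "x"], "p8"), 13: ("i", ["i"], "p8"),
    14: ("o14", ["s"], None),
}
# target statement -> [(predicate-variable, annotated variable, A(e))]; assumed subset
COND_EDGES = {12: [("p10", "x", True), ("p9", "x", True)]}
# (statement, outcome); outcome is True/False for predicates, None otherwise
TRACE = [(1, None), (2, None), (3, None), (4, None), (5, None), (6, None),
         (7, None), (8, True), (9, True), (10, False), (12, None), (13, None),
         (8, True), (9, True), (10, False), (12, None), (13, None),
         (8, False), (14, None)]


def forward_relevant_slice(du, cond_edges, trace):
    rel, ls, lp, data_dep = {}, {}, {}, {}
    last_true, last_false = {}, {}

    def deps(names):
        """Union of RelSlice(u) and {LS(u)} over the already defined names."""
        result = set()
        for u in names:
            if u in ls:
                result |= rel[u] | {ls[u]}
        return result

    for j, (i, outcome) in enumerate(trace, start=1):
        d, variables, ctrl = du[i]
        # PotDep(d): predicates whose opposite outcome occurred after LP(u_k).
        pot_dep = set()
        for p, var, a_e in cond_edges.get(i, []):
            opposite = last_false if a_e else last_true
            if var in variables and lp.get(var, 0) < opposite.get(p, 0):
                pot_dep.add(p)
        dds = set()  # DataDepSet(PotDep(d))
        for p in pot_dep:
            dds |= data_dep[p]
        used = variables + ([ctrl] if ctrl else [])
        new_rel = deps(used) | {ls[p] for p in pot_dep} | dds
        if outcome is not None:  # d is a predicate: control dependences ignored
            data_dep[d] = data_dep.get(d, set()) | deps(variables) | dds
            (last_true if outcome else last_false)[d] = j
        rel[d], lp[d], ls[d] = new_rel, j, i
    return rel, ls


rel, ls = forward_relevant_slice(DU, COND_EDGES, TRACE)
relevant = rel["o14"] | {ls["o14"]}
print(sorted(relevant))
```

With these inputs the sketch adds statements 2, 5 and 10 to the dynamic slice, while statements 4, 9 and 11 stay outside the relevant slice.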

The abbreviation DDS is used to denote DataDepSet(PotDep(d)).

The final slice can be obtained as the union of RelSlice(o14) and {LS(o14)}.
We can see that this slice contains all the instructions that may cause the failure
observed at the output statement (see Fig. 2 and 3).

4 Related Work 

Our method for the computation of dynamic and relevant slices significantly 
differs from the approaches presented in [3] and [4]. In [3] Agrawal and Horgan 
introduced different methods for determining dynamic slices. Their first two 
approaches are based on the static PDG and produce imprecise dynamic slices. 
The initial method marks those vertices of the PDG which are executed for a 
test case and computes the slice in the subgraph induced by the marked vertices. 
More precise dynamic slices can be created with the second method by marking 
those edges of the PDG where the corresponding dependences arise during the 
execution. In this case the marked edges of the PDG are traversed during the 
computation of a slice. The dynamic slices produced by this second approach 
are the same as the slices created by the algorithm proposed by Korel and Laski 
in [12]. 

The third method presented in [3] takes into account that different occurrences
of a statement may be affected by different sets of statements. This approach
uses a Dynamic Dependence Graph. Using this graph a precise dynamic
slice can be computed, but as mentioned, the size of the DDGs may be
unbounded. Agrawal and Horgan also proposed a reduced DDG method. The size
of reduced graphs is bounded by the number of different dynamic slices (see
Section 2 for details about the DDG and reduced DDG methods). In [18] a simple
program is presented (called $Q^n$) which has $O(2^n)$ different dynamic slices,
where n is the number of statements in the program. This example shows that
even this reduced DDG may be very large for some programs.

These dynamic slicing methods are not appropriate for debugging, because
during this process we also need to identify the potentially dependent predicates
and their data dependences. The relevant slices introduced in [4] for incremental
regression testing include these statements, therefore relevant slices are
very suitable for debugging as well. Agrawal et al. use the DDG representation
for the computation of relevant slices [4], hence the same problem of
size requirement appears which we discussed for the DDG slicing algorithm. Our forward
algorithm presented in this paper for the computation of relevant slices has no
such space problem.

In [14] Korel and Yalamanchili introduced a forward method for determining
dynamic program slices. Their algorithm computes executable program slices.
In many cases these slices are less accurate than those computed by our forward
dynamic slicing algorithm. (Executable dynamic slices may produce inaccurate
results in the presence of loops [18].) The method of Korel and Yalamanchili is
based on the notion of removable blocks. The idea of this approach is that during
the program execution, on each exit from a block, the algorithm determines
whether the executed block should be included in the dynamic slice or not.
Note that this method does not compute potential dependences, therefore it is
not really appropriate for debugging. Excellent comparisons of different dynamic
slicing methods can be found e.g. in [18], [10].

5 Summary 

In this paper we have shown that a dynamic program slice may fail to include
some statements that, if faulty, would affect the output where the failure is
revealed. On the other hand, relevant slicing, introduced for regression testing,
is well suited to debugging. However, the only available
algorithm may lead to very large graphs, which also slows the
algorithm down.

We have introduced a new forward global method for computing relevant
slices. What we need is only a modified Program Dependence Graph, and in
parallel with the program execution, the algorithm determines the relevant slices for
any program instruction. Since our method is global it can also be used for other
purposes such as automated test data generation.

Global forward methods are more suitable for regression testing too. Assume
that an output instruction is executed many times for a given test case. If only
a few of the relevant slices of these different actions contain a modified statement,
then the backward method has to be applied many times (starting from different
execution histories). Our forward global method determines the relevant slice for
each action for a single execution history, therefore we can terminate our method
when the relevant slice of the output first contains the modified instruction.

The main advantage of our approach is that its space requirement depends
neither on the number of different dynamic slices nor on the size of the execution
history. Therefore, this method can be applied to compute dynamic slices for
inputs that produce millions of execution steps.

We have developed an experimental system, where we implemented our for- 
ward dynamic and relevant slicing algorithms for a small subset of the C lan- 
guage. In addition to these slices, the system is able to determine the static slice, 
the execution slice (includes all statements that were executed for an input) and 
the approximate relevant slice which is defined in [4]. The approximate relevant
slice is an extension of the DDG slice in such a way that it contains all predicates 
that were executed. (The data dependences of these predicates are also added 
to the slice.) Obviously, the relevant slice is a subset of the approximate relevant 
slice, however the computation of the latter is simpler than the computation of 
the relevant slice. 

We tested our experimental system on three C programs. Each program
consisted of only a few hundred lines of code, but complex control structures
were used in them. Each program was executed on six different test cases and we
determined the different slices for the last occurrence of a selected variable in the
program. In Fig. 8 we can see the average results for these eighteen executions.

The average number of statements in the original programs was 260. We can
observe that for these programs the approximate relevant slice is significantly
larger than the relevant slice, but there is no big difference between dynamic and
relevant slices for our inputs. (Note that this small difference may contain those
statements which are responsible for the incorrect behaviour of the program;
however, the dynamic slicing methods ignore these instructions.)

We also tested the time requirements of our dynamic and relevant slice
algorithms by using the $Q^n$ program presented in [18]. The result is presented
in Fig. 9, where we can see that even this preliminary implementation of our
methods (without optimization) can produce slices for large execution histories
in reasonable time. (The experiments were performed on a Pentium 200MMX
machine with Microsoft NT.)

Fig. 9. Time vs. size of the EH

Our future goal is to extend our approach to compute dynamic and relevant 
slices for a large subset of C. To handle arrays and pointers we can apply those 
methods that were introduced in [1], [6], [15] for dynamic and static slices. An- 
other task is to optimize the implementation of our algorithm by using more 
effective set operations and graph searching methods. 

Abstract. Exception handling mechanisms provided by programming
languages are intended to ease the difficulty of developing robust soft- 
ware systems. Using these mechanisms, a software developer can describe 
the exceptional conditions a module might raise, and the response of the 
module to exceptional conditions that may occur as it is executing. Cre- 
ating a robust system from such a localized view requires a developer to 
reason about the flow of exceptions across modules. The use of unchecked 
exceptions, and in object-oriented languages, subsumption, makes it dif- 
ficult for a software developer to perform this reasoning manually. In 
this paper, we describe a tool called Jex that analyzes the flow of ex- 
ceptions in Java code to produce views of the exception structure. We 
demonstrate how Jex can help a developer identify program points where 
exceptions are caught accidentally, where there is an opportunity to add 
finer-grained recovery code, and where error-handling policies are not 
being followed. 

Keywords: Exception Handling, Software Analysis, Object-Oriented 
Programming Languages, Software Engineering Tools. 



1 Introduction 

To ease the difficulty of developing robust software systems, most modern pro- 
gramming languages incorporate explicit mechanisms for exception handling. 
Syntactically, these mechanisms consist of a means to explicitly raise an excep-
tional condition at a program point, and a means of expressing a block of code to
handle one or more exceptional conditions. In essence, these mechanisms provide 
a way for software developers to separate code that deals with unusual situations 
from code that supports normal processing. This separation helps a developer 
structure and reason about the code within a module. 

Unfortunately, local reasoning about the code is not generally sufficient to 
develop a module that will react appropriately to all unexpected situations. In 
some applications, such as games, it may be sufficient to trap an unexpected 
condition, write a generic error message, and terminate. In other applications, it 
is preferable to either recover silently, or to at least provide a meaningful error 
message before terminating. For instance, a user of a word processing application 
attempting to open a file that is already open may want to know that a file 
sharing violation has occurred and may want to try to correct the problem,
rather than just being told there was a file problem. Finer-grained reactions to 
exceptions require a software engineer to reason about code on which the module 
being constructed depends. 

The exception handling mechanisms provided in some programming lan- 
guages help a software developer perform this reasoning. Java [5] and CLU [9], 
for instance, both support the declaration of exceptions in module interfaces; 
the compiler can then check that appropriate handlers are provided in a client 
module. However, this support is only partial because each of these languages 
also provides a form of unchecked exceptions. The developer of a client module 
is not warned of the possibility of these exceptions by a compiler. Furthermore, 
object-oriented languages typically support the classification of exceptions into 
exception type hierarchies. These hierarchies introduce the possibility of writing 
general handlers that may implicitly catch more specific exceptions. This im- 
plicit catching of exceptions can complicate the development and evolution of 
robust classes [10]. 

To investigate whether information about the flow of exceptions might help a 
developer fill in these gaps, we have built a tool, called Jex, to analyze exception
flow in Java source code. When applied to a Java class, Jex determines the precise 
types of exceptions that may be raised at each program point, and then presents 
this information in the context of exception handling constructs in the source. 
Using this abstract view, a developer can reason about unhandled exceptions, 
and can identify and reason about exceptions handled through subsumption. 

In this paper, we describe the Jex tool and present the results of applying the 
tool to both Java library code and several sample Java applications. The analysis 
of this code with Jex indicated a number of occurrences of unhandled exceptions 
and a number of occurrences of exceptions handled implicitly. A qualitative 
investigation of these occurrences suggested places where finer-grained recovery 
code could be usefully added, and identified program points at which exception 
policies intended by a developer were not being followed. The ease with which
these code locations were found using Jex suggests that a flow-oriented view of 
exceptions may help developers economically improve the quality of their code. 

We begin, in Sect. 2, with an overview of the terminology of exception han- 
dling and of previous work involving the analysis of exceptions. In Sect. 3, we 
detail the basic exception handling mechanism in Java. Section 4 describes the 
view of the exception structure extracted by Jex, and the means by which the 
view is produced. We describe the use of Jex on sample Java code in Sect. 5, 
and discuss issues related to the use and generality of our approach in Sect. 6.
Section 7 summarizes the paper. 

2 Related Work 

Goodenough [4] introduced the exception handling concepts in common use to-
day. To provide a common basis for discussion, we begin with a brief review of
these concepts and the related terminology as defined by Miller and Tripathi.




324 



M.P. Robillard and G.C. Murphy 



An exception is an abnormal computation state. An exception occur- 
rence is an instance of an exception. . . . 

An exception is raised when the corresponding abnormal state is 
detected. Signaling an exception by an operation (or a syntactic entity 
such as a statement or block) is the communication of an exception 
occurrence to its invoker. The recipient of the originator of an exception 
is a syntactic entity, called the exception target (or target); the originator
of an exception is the signaler. The target is determined either by static 
scope rules or by dynamic invocation chain. 

An exception handler (or handler) is the code invoked in response to 
an exception occurrence. It is assumed that the handler’s code is sepa- 
rate from the non-exception (or normal) control path in the program. 
Searching for eligible handlers begins with the target (i.e. the search 
starts with the handlers associated with the target). An exception is con- 
sidered handled when the handler’s execution has completed and control 
flow resumes in the normal path. An exception handled by the target 
masks the exception from the target’s invokers. 

[10, pp. 86-87] 

Variants of Goodenough’s basic models have since been realized in many 
programming languages. 

In ML [6], a functional language, exceptions are values that can be declared 
anywhere in a program. These values can be signaled at any point following their 
declaration. Because it is difficult for programmers to ensure that all exceptions 
are caught, several static analyzers have been developed to track down unhandled 
exceptions in ML, including one by Yi [14,15,17], and a second, EAT, developed
at the University of California at Berkeley [3]. These tools differ in the precision 
of the uncaught exceptions reported and in the form in which the information 
is reported. Yi’s tool is more precise than EAT, but EAT, which uses a more 
conservative approach, is more scalable. The EAT tool also provides support for 
visualizing the declaration and handling of exceptions at different points in the 
program. 

Other, primarily imperative, languages support the declaration of exception 
types that may arise through execution of a module. In these languages it is 
possible to specify, in the signature of a method, a list of exception types that 
can be signaled by that method. Languages differ in the kinds of checking that 
are provided concerning declared exceptions. 

In C++ [12], the language specification ensures that a method can only raise
exceptions it declares. If a method signature does not include a declaration of 
exceptions, it is assumed that all types of exceptions may be raised. Any ex- 
ception raised within the method that is not declared is re-mapped to a special 
unexpected exception. The developer of a client is not informed of missing han- 
dlers. 

In contrast, in Java [5] and CLU [9], the compiler ensures that clients of 
a function either handle the exceptions declared by that function, or explicitly 
declare to signal them. In addition to these checked exceptions, Java and CLU
also support unchecked exceptions which do not place such constraints on a
client. We describe the exception handling mechanism of Java in further detail 
in the next section. 

Exception constructs are developer-oriented: they enable a developer to more 
cleanly express behavior for unusual situations. These same constructs compli-
cate traditional analyses such as control- and data-flow analyses. Recent work is
beginning to incorporate exception information into classical analysis techniques. 
For instance, Sinha and Harrold describe techniques to model control-flow in the 
presence of exceptions [11]. Choi et al. describe a representation to improve intra- 
procedural optimizations in the presence of exceptions [2]. These efforts differ 
from our work in that their focus is on modeling program execution rather than 
on enabling a developer to make better use of exception mechanisms. 

Yi and Chang [16] have sketched an approach within a set-constraint frame- 
work [7] that would provide an exception flow analysis for Java similar to that 
implemented by our tool. It is unclear whether formalization in the set-constraint 
framework will cause them to make different trade-offs between precision and 
scalability in the implementation they are pursuing than we have made in Jex.
In Sect. 6, we discuss the implementation trade-offs made in Jex. 



3 Exception Handling in Java 

In Java, exceptions are first-class objects. These objects can be instantiated, as- 
signed to variables, passed as parameters, etc. An exception is signaled using a 
throw statement. Code can be guarded for exceptions within a try block. Excep- 
tions signaled through execution of code within a try block may be caught in one 
or more catch clauses declared immediately following the try block. Optionally, 
a programmer can provide a finally block that is executed independently of 
what happens in the try block. Exceptions not caught in any catch block are 
propagated back to the next level of try block scope, possibly in the caller. 
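The mechanics described above can be illustrated with a minimal sketch. The class and method names (PropagationDemo, signal, inner, outer) are hypothetical and chosen only for this illustration. The catch clause in inner does not match the signaled type, so the exception propagates to the try block in the caller, while the finally block in inner still runs.

```java
import java.util.ArrayList;
import java.util.List;

class PropagationDemo {
    // Records the order in which handlers and finally blocks execute.
    static final List<String> log = new ArrayList<>();

    // Signals an exception with a throw statement.
    static void signal() {
        throw new IllegalStateException("abnormal state");
    }

    // The catch clause below does not match IllegalStateException, so the
    // exception is not handled here; the finally block runs regardless.
    static void inner() {
        try {
            signal();
        } catch (ArithmeticException e) {
            log.add("inner: caught");
        } finally {
            log.add("inner: finally");
        }
    }

    // The try block in the caller is the next level of scope searched.
    static void outer() {
        try {
            inner();
        } catch (IllegalStateException e) {
            log.add("outer: caught " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        outer();
        System.out.println(log);
    }
}
```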

Similar to other Java objects, exceptions are typed, and types are organized 
into a hierarchy. What distinguishes exceptions from other objects is that all 
exceptions inherit from the class type java.lang.Throwable. The exception type
hierarchy defines three different groups of exceptions: errors, runtime excep- 
tions, and checked exceptions. Errors and runtime exceptions are unchecked. 
Unchecked exceptions can be thrown at any point in a program and, if uncaught, 
may propagate back to the program entry point, causing the Java Virtual Ma- 
chine to terminate. By convention, errors represent unrecoverable conditions, 
such as virtual machine problems. 
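As a rough illustration of the three groups, the following sketch classifies a throwable type by its position in the hierarchy. The class name ExceptionKinds and its methods are hypothetical; the only assumption about Java itself is the standard rule that subtypes of Error and RuntimeException are unchecked and every other Throwable type is checked.

```java
class ExceptionKinds {
    // Classifies a throwable type into one of the three groups.
    static String kind(Class<? extends Throwable> type) {
        if (Error.class.isAssignableFrom(type)) {
            return "error";
        }
        if (RuntimeException.class.isAssignableFrom(type)) {
            return "runtime exception";
        }
        // Everything else, including Throwable and Exception themselves,
        // is treated by the compiler as checked.
        return "checked exception";
    }

    // Errors and runtime exceptions are the unchecked groups.
    static boolean isUnchecked(Class<? extends Throwable> type) {
        return !kind(type).equals("checked exception");
    }

    public static void main(String[] args) {
        System.out.println(kind(OutOfMemoryError.class));
        System.out.println(kind(NullPointerException.class));
        System.out.println(kind(java.io.IOException.class));
    }
}
```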

Java requires that checked exceptions which may be thrown from the body 
of a method be declared as a part of the method signature. The language also 
requires exception conformance [10], so a method M′ overriding the method M
of a supertype must not declare any exception type that is not the same type or
a subtype of the exception types declared by M.
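A small sketch of the conformance rule follows. The classes Base and Derived and the method load are hypothetical names introduced for illustration. The overriding method narrows the declared exception to a subtype, which the compiler accepts; widening it (for example to Exception) would be rejected.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class Conformance {
    static class Base {
        void load(String name) throws IOException {
            throw new IOException("cannot load " + name);
        }
    }

    static class Derived extends Base {
        // Allowed: FileNotFoundException is a subtype of IOException.
        // Declaring "throws Exception" here would not compile.
        @Override
        void load(String name) throws FileNotFoundException {
            throw new FileNotFoundException(name);
        }
    }

    // A client written against Base only needs a handler for IOException.
    static String describe(Base b) {
        try {
            b.load("data.txt");
            return "ok";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new Base()));
        System.out.println(describe(new Derived()));
    }
}
```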
The ability to declare exceptions within a hierarchy also means that an ex-
ception may be cast back implicitly to one of its supertypes when a widening
conversion requires it. For example, this conversion occurs when an assignment 
of an object of a subtype is made to a variable declared to be of its supertype. 
This property is called subsumption [1]; a subtype is said to be subsumed to 
the parent type. When looking for a target, exceptions can be subsumed to the 
type of the target catch clause if the type associated with the catch clause is 
a supertype of the exception type. Similarly, a method declaring an exception 
type E can throw any of the subtypes of E without having to explicitly declare 
them. 
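Both forms of subsumption mentioned above can be seen in a short sketch. The names SubsumptionDemo, open, generalHandler, and specificHandler are hypothetical. The method open declares IOException but throws a subtype without declaring it; the general handler then catches that subtype implicitly, whereas the specific handler distinguishes it.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class SubsumptionDemo {
    // Declares IOException, yet throws a subtype without declaring it.
    static void open(String name) throws IOException {
        throw new FileNotFoundException(name);
    }

    // The FileNotFoundException is subsumed to the type of the catch clause.
    static String generalHandler() {
        try {
            open("a.txt");
            return "no exception";
        } catch (IOException e) {
            return "IOException handler got " + e.getClass().getSimpleName();
        }
    }

    // Catch clauses are tried in order, so the more specific one wins.
    static String specificHandler() {
        try {
            open("a.txt");
            return "no exception";
        } catch (FileNotFoundException e) {
            return "FileNotFoundException handler";
        } catch (IOException e) {
            return "IOException handler";
        }
    }

    public static void main(String[] args) {
        System.out.println(generalHandler());
        System.out.println(specificHandler());
    }
}
```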

Java’s support for unchecked exceptions and subsumption means that it is 
impossible for a software developer to know the actual set of exceptions that 
may cross a method’s boundaries based on the method’s interface. The following 
section describes the information that is necessary to gain this knowledge. 



4 Jex: A Tool for Producing a View of the 
Exception Flow 

Understanding and evaluating how exceptions are handled within a method re- 
quires reasoning about which exceptions might arise as a method is executing, 
which exceptions are handled and where, and which exceptions are passed on. 

Manually extracting this information from source code is a tedious task for all 
but the simplest programs. In the case of an object-oriented program, a developer 
has to consider how variables bind to different parts of the type hierarchy, the 
methods that might be invoked as a result of the binding, and so on. For this 
reason, we have built the Jex tool, which automates this task for Java programs.
In Sect. 4.1, we describe the view of exception flow and structure produced by 
Jex. In Sect. 4.2, we describe the architecture of the Jex tool. 



4.1 Extracting Exception Structure 

To retain meaning for a developer, we wanted to present a view of the exception 
flow within the context of the structure of the existing program. Our Jex tool 
thus extracts, synthesizes, and formats only the information that is pertinent 
to the task. In the case of Java, our tool extracts, for each method, the nested 
try block structures, including the guarded block, the catch clauses, and the 
finally block. Within each of these structures, Jex displays the precise type of 
exceptions that might arise from operations, along with the possible origins of 
each exception type. If an exception originates from a method call, the class name 
and method raising the exception is identified. If an exception originates from 
the run-time environment, the qualifier environment is used. This information is 
placed within a Jex file corresponding to the analyzed class. 
We illustrate this exception structure using code from one of the constructors
of the class java.io.FileOutputStream from the JDK 1.1.3 API.^ Figure 1 shows
the code for the constructor; Fig. 2 shows the exception structure extracted ac-
cording to our technique.^ The extracted structure shows that the code preceding
the explicit try block may raise a SecurityException, and that the code inside the
try block may result in an IOException being raised by the call to openAppend or
open on an object of type FileOutputStream. The catch clause indicates that any
IOException raised during the execution of the code in the try block may result
in a FileNotFoundException being raised. FileNotFoundException is a subtype of
IOException, the exception declared in the constructor’s signature.

This analysis provides two useful kinds of information to a software devel- 
oper implementing or maintaining this constructor. First, the developer can see 
that the constructor may signal an unchecked SecurityException that originates 
from a checkWrite operation; a comment to this effect may be added to the con- 
structor’s header for the use of clients. Second, the developer can determine that 
the exceptions that may be raised within the scope of the try block are actually 
of type IOException and not some more specialized subtype; thus, finer-grained
handling of the exception is not possible and should not be attempted. Neither 
of these cases would be detectable based on an inspection of the constructor’s 
source code alone. 

The analysis can also benefit a client of the constructor. Consider the code 
for the doSomething method in Fig. 3. This code will pass the checking of the 
Java compiler as there is a handler for the declared exception, IOException.
Applying our technique to this code returns the information that the invocation 
of the FileOutputStream constructor might actually result in the more specialized 
FileNotFoundException or an unchecked SecurityException. 

Knowing the details about the exceptions flowing out of the constructor al- 
lows the developer of the client code to introduce additional handling. Figure 4 
shows an enhanced version of the doSomething client code. A handler has been 
introduced to catch SecurityException. This handler warns the user that per- 
mission to modify the file is missing. A handler is also introduced to provide a 
specialized error message for the case when a FileNotFoundException occurs. 

To conform to the constructor’s interface, it is also necessary to provide a 
handler for IOException: this handler serves to protect the client from future
modifications of the constructor, which may result in the throwing of an IO
exception different from FileNotFoundException.

4.2 The Architecture of Jex 

Jex, which comprises roughly 20 000 lines of commented Java source code spread 
over 131 classes, consists of four components: the parser, the abstract syntax tree 

^ Source code for the JDK 1.1.3 API is publicly available. This source code can be 
used to determine the exceptions possibly thrown by the various methods. 

^ Figure 2 is a simplified view of the information generated by Jex. Specifically, for 
clarity in presentation, we removed the full qualification of Java names that is usually 
shown. 
public FileOutputStream(String name, boolean append)
    throws IOException
{
    SecurityManager security = System.getSecurityManager();
    if (security != null) {
        security.checkWrite(name);
    }
    try {
        fd = new FileDescriptor();
        if (append)
            openAppend(name);
        else
            open(name);
    } catch (IOException e) {
        throw new FileNotFoundException(name);
    }
}



Fig. 1. The source code for the constructor of class FileOutputStream 

FileOutputStream(String, boolean) throws IOException
{
    SecurityException: SecurityManager.checkWrite(String);
    try {
        IOException: FileOutputStream.openAppend(String);
        IOException: FileOutputStream.open(String);
    }
    catch (IOException) {
        throws FileNotFoundException;
    }
}



Fig. 2. The structure of exceptions for the constructor of class FileOutputStream 



(AST), the type system, and the Jex loader. We constructed the parser using 
version 0.8pre1 of the Java Compiler Compiler™ (JavaCC) [13]. The current
implementation of the tool supports the Java 1.0 language specification. We built 
the AST using the JJTree preprocessor distributed with JavaCC. We use the 
AST to identify the structure of a class, to identify the structure of exceptions 
within a method or constructor, and to evaluate the operation invocations and 
expressions that may cause exceptions to be thrown.^ 

The AST relies on the type system to return a list of all types that override 
a particular method. Ensuring all possible types are considered in such an op- 
eration would require global analysis of all Java classes reachable through the 

^ The exception to this statement is that exceptions potentially thrown as a conse- 
quence of the initialization of static variables are not considered because it is not 
always possible to statically identify the program points where a class is first loaded. 
public void doSomething( String pFile )
{
    try {
        FileOutputStream lOutput = new FileOutputStream( pFile, true );
    }
    catch( IOException e ) {
        System.out.println( "Unexpected exception." );
    }
}



Fig. 3. An example of code not using Jex information 

public void doSomething( String pFile )
{
    try {
        FileOutputStream lOutput = new FileOutputStream( pFile, true );
        // Various stream operations
    } catch( SecurityException e ) {
        System.out.println( "No permission to write to file " + pFile );
    } catch( FileNotFoundException e ) {
        System.out.println( "File " + pFile + " not found" );
    } catch( IOException e ) {
        System.out.println( "Unexpected exception" );
    }
}



Fig. 4. An example of code making use of Jex information 



Java class path. This approach has the disadvantage of being overly conservative
because unrelated classes are considered. For example, the method toString of
class Object is often redefined by application classes. Two classes, both in the
class path but from two unrelated applications, might each redefine toString.
If a method in a class of the first application makes a call to toString(), it is
reasonable to assume that the method toString implemented by the second class 
will not be invoked. To prevent this, we restrict the analysis to a set of packages 
defined by the user. The normal Java method conformance rules are taken into 
account in establishing the potential overriding relationships between methods. 

To determine the actual exceptions thrown by a Java statement, the AST 
component relies on the Jex loader. Given a fully qualified Java type name, the 
Jex loader locates the Jex file describing that type. The AST component can then 
query the Jex loader to return the exceptions that might arise from a method 
conforming to a particular method signature for that type. The Jex files for a 
Java type are stored in a directory structure that parallels the directory structure 
of the Java source files. It is necessary to have a different directory structure for 
Jex files because some class files might not be in writable directories. The Jex 
files serve both to provide a view of the exception structure for the user, and as 
an intermediate representation for the Jex system. 
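The lookup performed by the loader can be pictured with a short sketch. This is not the Jex implementation: the class name JexFileLocator, the ".jex" file extension, and the root directory are assumptions made purely for illustration. The sketch only shows how a fully qualified type name could be mapped onto a directory structure that parallels the package layout of the sources.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class JexFileLocator {
    // Maps a fully qualified type name to a file under a separate root.
    // The ".jex" extension is a hypothetical naming convention.
    static Path locate(Path jexRoot, String qualifiedTypeName) {
        String relative = qualifiedTypeName.replace('.', '/') + ".jex";
        return jexRoot.resolve(relative);
    }

    public static void main(String[] args) {
        // Nested types are not handled by this sketch.
        System.out.println(locate(Paths.get("jexfiles"), "java.io.FileOutputStream"));
    }
}
```

Keeping the root separate from the source or class directories reflects the point made above that those directories may not be writable.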
To use Jex, a user must specify a list of packages, a path to search for Jex
files, and a Java source code file. Currently, the Jex system requires that all 
necessary Jex files to analyze that source code file are available. We plan to 
eliminate this restriction in a future version. 

5 Evaluating Jex 

Our original hypotheses about the usefulness of the Jex tool were based on obser- 
vations made while programming in Java. Particularly in the initial construction 
of a method, it is often tempting, for expediency, to insert a catch clause that 
will simply handle all exception types. A developer might choose this course of 
action not as a result of negligence, but because insufficient information about 
the types of exceptions that might arise and the error handling policies to be 
employed prevents the implementation of an appropriate handler. 

Introducing these generalized handlers causes exceptions to be caught through 
subsumption. Although such short-cuts should be refined as development pro- 
ceeds, some occurrences may evade detection. We were interested in determining 
how often cases of exception subsumption and uncaught exceptions occur in re- 
leased code, as well as whether the detection of these cases could suggest ways 
in which the code could be made more robust. 

To investigate these two factors, we analyzed a variety of source code using 
Jex:

- JTar, a command-line utility for the extraction of tar files,®

- the java.util.Vector and java.io.FileOutputStream classes from the Sun™
Java Development Kit version JDK 1.1.3,

- the Sun™ javax.servlet and javax.servlet.http packages, version 1.23,

- a command-line rule parser,® and

- four database and networking packages from the Atlas web course server
project [8]: userDatabase, userData, userManager, and userInfoContainers.

Together, these packages comprise roughly 6500 commented lines of code, in- 
cluding input/output, networking, and parsing operations. 

In applying Jex to these packages, we made several choices. First, we decided 
to suppress exceptions signaled by the environment, such as ArithmeticException 
and ArrayIndexOutOfBoundsException, to avoid unnecessary cluttering of the re-
sults with highly redundant exception types. Second, we did not perform the
Jex analysis on the classes comprising the JDK API. Instead, we generated a 
Jex file for each of the relevant API classes using a script that extracts the in- 
formation from corresponding HTML files produced by Javadoc. Javadoc is a 
tool that automatically converts Java source code files containing special markup 

® Package net.vtic.tar, developed by J. Marconi and available from the Giant Java 
Tree, http://www.gjt.org. 

® Available from a compiler course web page of the School of Computing, National 
University of Singapore (http://dkiong.comp.nus.edu.sg/compilers/a/).




Analyzing Exception Flow in Java™ Programs 331



comments into HTML documentation. The Jex files produced from these scripts 
simply consist of a list of exception types potentially thrown by each method 
of the class. The list consists of a union of the exception types declared in the 
method’s signature with the exception types annotated in the special markup 
comments. The exception types annotated in the comments for a class may in- 
clude both declared and run-time exception types. 

The graph in Fig. 5 shows a breakdown of exceptions and their associated 
handling in the analyzed code. It represents information from all the packages 
we analyzed that contain at least one try block. Each bar in the graph shows 
the number of occurrences of different levels of subsumption in handlers. The 
level of subsumption between the type T of an exception potentially raised in a 
try block and the type T′ declared in a catch clause is the difference in depth in
the type hierarchy between T and T′. Levels zero and one are labeled by their
semantic equivalent: “same type” and “supertype” , respectively. Exception types 
raised in a try block that cannot be subsumed to any of the types declared in 
the catch clauses remain uncaught by the try block. In all but one case, the 
Rule Parser, uncaught exceptions can propagate out of try blocks. All but two 
of the packages, namely the Java JDK and servlet code, contain exception
handlers that catch exceptions through subsumption. 
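The level of subsumption defined above can be computed directly from the class hierarchy. The following sketch uses reflection and is only an illustration of the definition, not a description of how Jex computes it; the class name SubsumptionLevel and the convention of returning -1 for a handler that cannot catch the raised type are assumptions.

```java
class SubsumptionLevel {
    // Number of superclass links between a class and java.lang.Object.
    static int depth(Class<?> c) {
        int d = 0;
        for (Class<?> s = c.getSuperclass(); s != null; s = s.getSuperclass()) {
            d++;
        }
        return d;
    }

    // Difference in hierarchy depth between the raised type and the type
    // declared in the catch clause; -1 if the clause cannot catch it.
    static int level(Class<? extends Throwable> raised,
                     Class<? extends Throwable> declared) {
        if (!declared.isAssignableFrom(raised)) {
            return -1;
        }
        return depth(raised) - depth(declared);
    }

    public static void main(String[] args) {
        // Level 0 is "same type", level 1 is "supertype".
        System.out.println(level(java.io.FileNotFoundException.class,
                                 java.io.IOException.class));
    }
}
```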



[Figure 5: bar chart of exception matching in catch clauses, by file or package.
Legend: Same Type, Super Type, 2 Levels, 3 Levels, Uncaught.]

Fig. 5. Exception matching in catch clauses



Table 1 provides a different view of the data. This view illustrates that 32% of 
the different exception types present in try blocks remain uncaught in a target. In 
44% of the cases, exceptions are not caught with the most precise type available. 

Table 1. Levels of subsumption required to catch an exception

Level of Subsumption    Frequency
Same type               24 %
Supertype               16 %
2 Levels                20 %
3 Levels                 8 %
Uncaught                32 %

This data lends evidence to support our claims that exception subsumption
and unhandled exceptions are prevalent in Java source code. However, this
quantitative data does not indicate whether the quality of the code could be
improved through the use of Jex-produced information. To investigate the
usefulness of the information, we performed an after-the-fact manual inspection
of the source code. We focused this inspection on cases of subsumption since the
benefits of identifying uncaught exceptions are straightforward and are discussed
in greater depth elsewhere [3,14,15,17].

Our investigation of the cases of exception subsumption found several in- 
stances in which knowledge of the subsumption could be used to improve the 
code. In the RuleParser application, for instance, the body of a method reading
a line from an input buffer is guarded against all exceptions using the Exception 
type. This type is a supertype to much of the exception hierarchy. Expecting 
input problems, the code produces a message about a source input exception. 
However, Jex analysis reveals that two other types of runtime exception may 
also arise: StringIndexOutOfBoundsException and SecurityException. These two
unchecked exceptions will be caught within the Exception handler, producing an 
inappropriate error message. More specific exception handlers could be added 
to improve the coherency of termination messages, or to implement recovery 
actions. 
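The kind of accidental catch described here can be reproduced in a few lines. The code below is not taken from the RuleParser application; the class AccidentalCatch, its methods, and the messages are hypothetical. The first method guards everything with Exception, so an indexing problem is reported as an input problem; the second separates the two cases.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class AccidentalCatch {
    // Generalized handler: every failure is reported as an input problem.
    static String readKeyGeneral(String source) {
        try {
            BufferedReader in = new BufferedReader(new StringReader(source));
            String line = in.readLine();    // may throw IOException
            return line.substring(0, 3);    // may throw StringIndexOutOfBoundsException
        } catch (Exception e) {
            return "source input exception";
        }
    }

    // Refined handlers: the unchecked exception gets its own message.
    static String readKeyRefined(String source) {
        try {
            BufferedReader in = new BufferedReader(new StringReader(source));
            String line = in.readLine();
            return line.substring(0, 3);
        } catch (StringIndexOutOfBoundsException e) {
            return "line too short";
        } catch (IOException e) {
            return "source input exception";
        }
    }

    public static void main(String[] args) {
        System.out.println(readKeyGeneral("ab"));
        System.out.println(readKeyRefined("ab"));
    }
}
```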

We found other uses of subsumption in the Atlas packages. For example, in
a database query, exceptions signaled by reading from a stream are all caught 
by a generalized catch clause which generates a generic “read error” message 
and which re-throws a user-defined exception. However, the exceptions thrown 
in the try block include such specialized types as StreamCorruptedException, 
InvalidClassException, OptionalDataException, and FileNotFoundException. It 
may be advantageous to catch these exceptions explicitly, producing a more 
descriptive error message when one of the exceptions occurs. 

Cases of subsumption were also useful in pointing out source points at which 
exception handling code did not conform to the strategy established by the 
developer. For example, in one of the Atlas classes, an exception was explicitly 
thrown in a try block, caught in a catch clause corresponding to the same try 
block, and re-thrown. In another case, two similar accessor methods displayed 
different exception handling strategies: one masked all exceptions; the other one 
masked only two specific exceptions. A discussion with the developer of Atlas 
allowed the irregular exception handling strategies to be traced to unstable or 
unfinished code. The abstract view of the exception flow provided by Jex made
it easy to hone in on these suspicious cases. 



6 Discussion 

6.1 White-Box Exception Information 

By expressing the actual exceptions that may flow out of a method invocation, 
we expose knowledge about the internals of a supplier method to a client. If
a software developer relied upon this knowledge of a supplier’s implementation 
rather than on the supplier’s declared interface, unintended dependencies could 
be introduced, potentially limiting the evolution of the client. 

For instance, consider the case previously described for Atlas in which the 
developer learned that a particular method could receive a number of specialized 
exception types, such as StreamCorruptedException and InvalidClassException. 
Assume that the operations that can raise these exceptions declare more general 
exception types as part of their interfaces. If the developer introduced handlers 
only for each of the specialized types that could actually occur, the code might 
break if an operation evolved to signal a different specialized exception type. 
In the case of Java, this situation cannot arise because the compiler forces the 
presence of handlers for the exception types declared by supplier operations. If 
the language environment did not provide this enforcement, our approach would 
have to be extended to ensure that the use of white-box information did not 
complicate evolution. 



6.2 Alternative Approaches 

Increasing the robustness and recovery granularity of applications does not re- 
quire a static analysis tool. One alternative currently in use is to document the 
precise types of exceptions that a method may throw in comments about the 
method. With this approach, a developer can retain flexibility in a method inter- 
face, but still provide additional information to clients wishing to perform finer- 
grained recovery. A disadvantage of this approach is that it forces the developer 
to maintain consistency between the program code and the documentation, an 
often arduous task. Moreover, this approach assumes that a developer knows all 
of the exception types that might be raised within the body of the method being 
developed; the presence of runtime exceptions makes it difficult for a developer 
to provide complete documentation. 

Another course of action available is for a software developer to inspect the 
exception type hierarchy, and to provide handlers for all subtypes of a declared 
exception type. It is unlikely that in most situations the extra cost of producing 
and debugging these handlers is warranted. Our approach provides a means 
of determining cost-effectively which of the many possible handlers might be 
warranted at any particular source point. 

6.3 The Descriptive Power of the Current Exception Structure

The current exception structure extracted for source files enables a developer to 
determine the exceptions that can be signaled at any point in the program, along 
with the origin of these exceptions. The former information allows a developer to 
determine the actual exceptions that can cross a module boundary. The latter 
information allows a developer to trace exceptions to their source, enabling a 
more thorough inspection. 

One aspect missing from the information currently produced by Jex is a link 
to the particular statements that can produce an exception. As a result, it is not 
possible to trace actual instances of exceptions. For example, when an exception 
is explicitly thrown, it is not possible to determine if it is a new exception or 
if an existing exception instance is being re-thrown. Information about excep- 
tion instances would allow developers to reason about how specific exceptional 
conditions circulate in a program. However, it is unclear whether the additional 
benefits that could be obtained from the more specific origin information out- 
weigh the possible disadvantage of reducing the clarity and succinctness of the 
exception structure. 



6.4 The Precision of Jex Information 

There are three cases in which our Jex tool may not return conservative in- 
formation. First, Jex uses the packages specified by the user as the “world” in 
which to search for all possible implementations of a particular method. If a user 
fails to specify a relevant package, Jex may not report certain exceptions. If, in 
specifying the packages, the user fails to include a package defining a type being 
analyzed, Jex can issue a warning message. If the user fails to specify packages 
that extend types that are already defined, then Jex is unable to warn the user. 

Second, Jex relies on a model of the language environment to determine the
exceptions that might arise from basic operations, such as an add operation, and 
the exceptions that might arise from native methods. Although the model of the 
environment we used when applying Jex to the code described in Sect. 5 was 
partial, Jex still returned information useful to a developer. 

Third, Jex does not report asynchronous exceptions [5]. An asynchronous 
exception may arise from a virtual machine error, such as running out of memory, 
or when the stop method of a thread object is invoked. Since these exceptions can 
arise at virtually any program point, we assume a user of Jex will find it easier
to use the output of the tool if it is not cluttered with this information. However, 
it may prove useful, once more experience is gained with Jex, to introduce an 
option into the tool to output such exceptions as a means of reminding the user. 

If Jex returned information that was too conservative, the usability of our 
approach would likely be impacted. With Jex, this situation can arise when 
reporting all possible runtime exceptions because there are many points in the 
code that can raise such exceptions as ArithmeticException. This situation can 
be managed by providing a means of eliding this information when desired. For 
the code we analyzed, we chose not to consider these runtime exceptions so as
to more easily focus on application-related exception-handling problems. 

Another source of imprecision in Jex arises from the assumption that a call 
to a method made through a variable might end up binding to any conforming 
implementation on any subtype of the variable’s type. In some cases, it may be 
possible to use type inference to limit the subtypes that are considered. Although
our experience with Jex is limited, we have not found that this assumption 
greatly increases the exception information returned. 

7 Summary 

It is not uncommon for users of software applications to become frustrated by 
misleading error messages or program failures. Exception handling mechanisms 
present in modern languages provide a means to enable software developers to 
build applications that avoid these problems. Building applications with appro- 
priate error-handling strategies, though, requires support above and beyond that 
provided by a language’s compiler or linker. To encode an appropriate strategy, 
a developer requires some knowledge of how exceptions might flow through the 
system. 

In this paper, we have described a static analysis tool we have built to help 
developers access this information. The Jex tool extracts information about the 
structure of exceptions in Java programs, providing a view of the actual excep- 
tions that might arise at different points and of the handlers that are present. 
Use of this tool on a collection of Java library and application-oriented source 
code demonstrates that the approach can help detect both uncaught exceptions, 
and uses of subsumption to catch exceptions. 

The view of exception flow synthesized and reported by Jex can provide 
several benefits to a developer. First, a developer can introduce handlers for un- 
caught exceptions to increase the robustness of code. Second, a developer can 
determine cases in which unanticipated exceptions are accidentally handled; re- 
fining handlers for these cases may also increase code robustness. Third, inspec- 
tion of subsumption cases may indicate points where the addition of finer-grained 
recovery code could improve the usability of a system. Finally, the abstract view 
of the exception structure can help a developer detect potentially problematic 
or irregular error-handling code. The approach described in the paper and the 
benefits possible are not limited to Java, but also apply to other object-oriented 
languages. 

Abstract. Information about which statements in a concurrent program may happen
in parallel (MHP) has a number of important applications. It can be used in
program optimization, debugging, program understanding tools, improving the 
accuracy of data flow approaches, and detecting synchronization anomalies, such 
as data races. In this paper we propose a data flow algorithm for computing a 
conservative estimate of the MHP information for Java programs that has a worst- 
case time bound that is cubic in the size of the program. We present a preliminary 
experimental comparison between our algorithm and a reachability analysis al- 
gorithm that determines the "ideal" static MHP information for concurrent Java 
programs. This initial experiment indicates that our data flow algorithm precisely 
computed the ideal MHP information in the vast majority of cases we examined. 
In the two out of 29 cases where the MHP algorithm turned out to be less than ide- 
ally precise, the number of spurious pairs was small compared to the total number 
of ideal MHP pairs. 



1 Introduction 

Information about which statements in a concurrent program may happen in parallel
(MHP) has a number of important applications. It can be used for detecting
synchronization anomalies, such as data races [6], for improving the accuracy of various
data flow analysis and verification approaches (e.g., [14, 9, 19]), for improving program
understanding tools, such as debuggers, and for detecting program optimizations. For example,
in optimization, if it is known that two threads of control will never attempt to enter a 
critical region of code at the same time, any unnecessary locking operations can be 
removed. 

In general, the problem of precisely computing all pairs of statements that may 
execute in parallel is undecidable. If we assume that all control paths in all threads of 
control are executable, then the problem is NP-complete [21]. In this paper, we call the
solution with this assumption the ideal MHP information for a program. In practice, a
trade-off must be made where, instead of the ideal information, a conservative estimate 
of all MHP pairs is computed. In this context a conservative estimate contains all the pairs 
that can actually execute in parallel but may also contain spurious pairs. The precision 
of such approaches can be measured by comparing the set of pairs computed by an 
approach with the ideal set, if the latter is known. 
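
The comparison with the ideal set can be sketched as follows. The class MhpPrecision and the string encoding of statement pairs are assumptions made for this illustration only. An estimate is conservative if it contains every ideal pair, and its imprecision is the set of pairs it contains beyond the ideal ones.

```java
import java.util.HashSet;
import java.util.Set;

class MhpPrecision {
    // Encodes an unordered pair of statements as "a|b" with a <= b,
    // so that (a, b) and (b, a) map to the same string.
    static String pair(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    // A conservative estimate contains all pairs of the ideal set.
    static boolean isConservative(Set<String> computed, Set<String> ideal) {
        return computed.containsAll(ideal);
    }

    // Spurious pairs are those reported by the estimate but absent
    // from the ideal set.
    static Set<String> spurious(Set<String> computed, Set<String> ideal) {
        Set<String> extra = new HashSet<>(computed);
        extra.removeAll(ideal);
        return extra;
    }
}
```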

In this paper we propose a data flow algorithm for computing a conservative estimate 
of MHP information for Java programs that has a worst-case time bound that is cubic in 
the size of the program. In the rest of this paper we refer to this algorithm as the MHP 
algorithm. To evaluate the practical precision of our algorithm, we have carried out a 
preliminary experimental comparison between our algorithm and a reachability-based 
algorithm that determines the ideal MHP information for concurrent Java programs. 
Of course, since this reachability algorithm can only be realistically applied to small 
programs, our experiment was restricted to programs with a small number of statements. 
This initial experiment indicates that our algorithm precisely computed the ideal MHP 
information in the vast majority of cases we examined. In the two out of 29 cases where 
the MHP algorithm turned out to be less than ideally precise, the number of spurious 
pairs was small compared to the total number of ideal MHP pairs. 

Several approaches for computing MHP information for programs using various 
synchronization mechanisms have been suggested. Callahan and Subhlok [4] proposed a 
data flow algorithm that computes, for each statement in a concurrent program with post- 
wait synchronization, the set of statements that must be executed before this statement 
can be executed (B4 analysis). Duesterwald and Soffa [6] applied this approach to the 
Ada rendezvous model and extended B4 analysis to be interprocedural. Masticola and 
Ryder [15] proposed an iterative approach that computes a conservative estimate of the 
set of pairs of communication statements that can never happen in parallel in a concurrent 
Ada program. (The complement of this set is a conservative approximation of the set of 
pairs that may occur in parallel.) In that work, it is assumed initially that any statement 
from a given process can happen in parallel with any statement in any other process. 
This pessimistic estimate is then improved by a series of refinements that are applied 
iteratively until a fixed point is reached. This approach yields more precise information 
than the approaches of Callahan and Subhlok and of Duesterwald and Soffa. Masticola 
and Ryder show that in the worst case the complexity of their approach is $O(S^5)$, where
S is the number of statements in a program. 

Recently, Naumovich and Avrunin [17] proposed a data flow algorithm for computing
MHP information for programs with a rendezvous model of concurrency. Although the
worst-case complexity of this algorithm is $O(S^6)$, their experimental results suggest
that the practical complexity of this algorithm is cubic or less in the number of program
statements¹. Furthermore, the precision of this algorithm was very high for the examples
they examined. For a set of 132 concurrent Ada programs, the MHP algorithm failed to
find the ideal MHP information in only 5 cases. For a large majority of the examples,
the MHP algorithm was more precise than Masticola and Ryder’s approach.

¹ The size of the program model is $O(S^2)$ in the worst case, and the worst-case complexity of
the algorithm is cubic in the size of the program model. It appears that in practice the size of
the program model is linear in the number of program statements S [17].

The MHP algorithm described in this paper is similar in spirit to the algorithm proposed
for the rendezvous model but has a number of significant differences prompted
by the difference between the rendezvous-based synchronization in Ada and the shared
variable-based synchronization in Java. First, the program model for Java is quite different
from the one for Ada. Second, while the algorithm for Ada relies on distinguishing
between only two node types (nodes representing internal operations in processes and 
nodes representing inter-process communications), the algorithm for Java has to distin- 
guish between a number of node types corresponding to the eclectic Java synchronization 
statements. This also implies that the two algorithms employ very different sets of flow 
equations. Third, while the algorithm for Ada operates on a completely precomputed 
program model, for Java partial results from the algorithm can be used to modify the 
program model in order to obtain a more precise estimate of MHP information. Finally, 
the worst-case complexity of the MHP algorithm for Java is only cubic in the number 
of statements in the program. 

The next section briefly introduces the Java model of concurrency and the graph 
model used in this paper. Section 3 describes the MHP algorithm in detail and states the 
major results about its termination, worst-case complexity, and conservativeness. Sec- 
tion 4 describes the results of an experiment in which MHP information was computed 
for a number of concurrent Java programs using both the MHP algorithm and the reach- 
ability approach in order to evaluate the precision and performance of the algorithm. We 
conclude with a summary and discussion of future work. 



2 Program Model 

2.1 Java Model of Concurrency 

In Java, concurrency is modeled with threads. Although the term thread is used in the 
Java literature to refer to both thread objects and thread types, in this paper we call 
thread types thread classes and thread instances simply threads. Any Java application 
must contain a unique main() method, which serves as the “main” thread of execution.
This is the only thread that is running when the program is started. Other threads in the
program have to be started explicitly, either by the main thread or by some other already
running thread calling their start() methods.

Java uses shared memory as the basic model for communications among threads. In 
addition, threads can affect the execution of other threads in a number of other ways, 
such as dynamically starting a thread or joining with another thread, which blocks the 
caller thread until the other thread finishes. 

The most important of the Java thread interaction mechanisms is based on monitors. 
A monitor is a portion of code (usually, but not necessarily, within a single object) in 
which only one thread is allowed to run at a time. Java implements this notion with locks 
and synchronized blocks. Each Java object has an implicit lock, which may be used 
by synchronized blocks and methods. Before a thread can begin execution of a 
synchronized block, this thread must first acquire the lock of the object associated 
with this block. If this lock is unavailable, which means that another thread is executing a 
synchronized block for this lock, the thread waits until the lock becomes available. 
A thread releases the lock when it exits the synchronized block. Since only one 
thread may be in possession of any given lock at any given time, this means that at most 
one thread at a time may be executing in one of the synchronized blocks protected 
by that lock. A synchronized method of an object obj is equivalent to a method in
which the whole body is a synchronized block protected by the lock of the object
obj.
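
The equivalence between a synchronized method and a method whose whole body is a synchronized block can be illustrated with a small sketch. The class SyncCounter and its members are hypothetical names chosen for this example. Both increment operations are protected by the lock of the same object, so neither loses updates when two threads run concurrently.

```java
class SyncCounter {
    private int viaMethod = 0;
    private int viaBlock = 0;

    // Synchronized method: the lock of 'this' is held for the whole body.
    synchronized void incMethod() { viaMethod++; }

    // Equivalent form: the whole body is a synchronized block on 'this'.
    void incBlock() {
        synchronized (this) { viaBlock++; }
    }

    synchronized int countViaMethod() { return viaMethod; }
    synchronized int countViaBlock() { return viaBlock; }

    // Runs two threads that each perform n increments of both counters.
    static SyncCounter run(final int n) {
        final SyncCounter c = new SyncCounter();
        Runnable work = () -> {
            for (int i = 0; i < n; i++) { c.incMethod(); c.incBlock(); }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c;
    }
}
```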

class Writer extends Thread
{
    public static void main(String[] args)
    {
        Buffer buffer = new Buffer();
        Reader r1 = new Reader(buffer);
        Reader r2 = new Reader(buffer);
        r1.start();
        r2.start();
        while (notEnough())
        {
            synchronized (buffer)
            {
                buffer.write();
                buffer.notifyAll();
            }
        }
        r1.join();
        r2.join();
    }
}

class Reader extends Thread
{
    Buffer buffer;
    public Reader(Buffer b)
    {
        buffer = b;
    }
    public void run()
    {
        while (notEnough())
        {
            synchronized (buffer)
            {
                while (buffer.isEmpty())
                {
                    buffer.wait();
                }
                buffer.read();
            }
        }
    }
}

Fig. 1. Java code example

Threads may interrupt their execution in monitors by calling the wait() method
of the lock object of this monitor. During execution of the wait() method, the thread
releases the lock and becomes inactive, thereby giving other threads an opportunity to
acquire this lock. Such inactive threads may be awakened only by some other thread
executing either the notify() or the notifyAll() method of the lock object.
The difference between these two methods is that notify() wakes up one arbitrary
thread from all the potentially many waiting threads and notifyAll() wakes up all
such threads. Similar to calls to wait(), calls to the notify() and notifyAll()
methods must take place inside monitors for the corresponding locks. Both notification
methods are non-blocking, which means that the notification call will return and
execution will continue, whether there are waiting threads or not.
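
The two rules just stated can be observed directly in a small sketch. The class MonitorRules and its method names are hypothetical. The sketch relies only on standard java.lang.Object behavior: wait() and notifyAll() throw IllegalMonitorStateException when the caller does not hold the lock, and notifyAll() returns immediately when no thread is waiting.

```java
class MonitorRules {
    // True if notifyAll() is rejected when called outside the monitor.
    static boolean notifyOutsideMonitorFails(Object lock) {
        try {
            lock.notifyAll();
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    // True if wait() is rejected when called outside the monitor.
    static boolean waitOutsideMonitorFails(Object lock) {
        try {
            lock.wait(1);
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Inside the monitor the notification returns even with no waiters.
    static boolean notifyInsideMonitorReturns(Object lock) {
        synchronized (lock) {
            lock.notifyAll();
        }
        return true;
    }
}
```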

The example in Figure 1 illustrates some of the Java concurrency constructs. In this
example, one thread writes into and two threads read from a shared memory buffer. (The
source code for the buffer is not shown.) The main thread Writer instantiates a buffer
object and also instantiates two threads r1 and r2 of thread type Reader. Note that r1
and r2 do not start their execution until the main thread calls their start() methods.
Each of the threads repeatedly calls notEnough(), a function not shown in the figure
that determines whether enough data has been written and read. If notEnough()
returns true, each thread tries to enter the monitor associated with the buffer object. If the
main thread enters the monitor, it writes to the buffer and then calls the notifyAll()
method of the buffer, waking up the threads waiting to enter the monitor, if any. It
then leaves the monitor and calls notEnough() again. If a reader enters the monitor,
it checks whether the buffer contains any data. If so, it reads the data and leaves the
monitor. If the buffer is empty, the reader calls the wait() method of the buffer object,
relinquishing the lock associated with the monitor. The reader then sleeps until awakened
by the main thread calling buffer.notifyAll(). After a reader thread wakes up,
it has to reacquire the lock associated with the buffer object before it can re-enter the
monitor. After it re-enters the monitor, it reads from the buffer and then exits the monitor.
After the main thread’s call to function notEnough() evaluates to false, it exits
its loop and calls the join() methods of threads r1 and r2. If thread r1 has not
terminated by the time the main thread starts executing the call to r1.join(), the
main thread will block until thread r1 terminates, and similarly for thread r2. Thus,
these two calls to join() ensure that both reader threads terminate before the main
thread does. (The astute reader will notice that this program may deadlock with one
of the reader threads waiting for a notification while the buffer is empty and the writer
having exited its loop.)

In the rest of the paper we refer to the start(), join(), wait(), notify(), and
notifyAll() methods as thread communication methods².



2.2 Parallel Execution Graph 

We use a Parallel Execution Graph (PEG) to represent concurrent Java programs. The 
PEG is built by combining, with special kinds of edges, control flow graphs (CFGs) for 
all threads that may be started in the program. In general, the number of instances of 
each thread class may be unbounded. For our analysis we make the usual static analysis 
assumption that there exists a known upper bound on the number of instances of each 
thread class. In addition, we assume that alias resolution has been done (e.g., using meth- 
ods such as [10,5]). After alias resolution is performed, we can use cloning techniques 
(e.g. [20]) to resolve object and method polymorphism. Under these assumptions, our 
program model and algorithm handle “complex” configurations of Java concurrency 
mechanisms, such as nested synchronized blocks, multiple monitors, multiple in- 
stances of lock objects, etc. For example, suppose that alias resolution determines that at 
some point in the program a particular variable may possibly refer to two different ob- 
jects. If at that point in the program this variable is used to access a monitor, cloning will 
produce a structure with two branches, where one branch contains the monitor access 
with the first object as the lock and the other branch contains the monitor access with the 
second object as the lock. Thus, alias resolution and cloning techniques are important 
for improving the precision of the MHP analysis.

At present we inline all called methods, except communication methods, into the
control flow graphs for the threads. This results in a single CFG for each thread. Each call
to a communication method is labeled with a tuple (object, name, caller), where
name is the method name, object is the object owning method name, and caller
is the identity of the calling thread. For example, for the code in Figure 1, the call
r1.start() in the main method will be represented with the label (r1, start, main).
For convenience, we will use this notation to label nodes that do not correspond to
method calls by replacing the object part of the label with the symbol *. For example,
the first node of a thread t is labeled (*, begin, t) and the last node of this thread is
labeled (*, end, t).

² Note that concurrency primitives join() and wait() in Java have “timed” versions. Also,
additional thread methods stop(), suspend(), and resume() are defined in JDK 1.1 but
have been deprecated in JDK 1.2 since they encourage unsafe software engineering practices.
It is not difficult to incorporate handling of these statements in our algorithm but because of
space limitations we do not do this here.

(lock, wait, t)  =>  (lock, wait, t) -> (lock, waiting, t) -> (lock, notified-entry, t)

Fig. 2. CFG transformation for wait() method calls

To make it easy to reason about groups of communications, we overload the symbol *
to indicate that one of the parts of the communication label can take any value. For
example, (t, start, *) represents the set of labels in which some thread in the program
calls the start method of thread t. We will write n ∈ (t, start, *) to indicate that
n is one of the nodes that represent such a call. It will be clear from the context whether
a tuple (object, name, caller) denotes a label of a single node or a set of nodes
with matching labels.
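
The label notation and the overloaded symbol * can be sketched as a small data type. The class CommLabel and its method matches are hypothetical names introduced for this illustration. A node label belongs to the set denoted by a pattern if every part of the pattern is either * or equal to the corresponding part of the label.

```java
final class CommLabel {
    static final String ANY = "*";
    final String object;
    final String name;
    final String caller;

    CommLabel(String object, String name, String caller) {
        this.object = object;
        this.name = name;
        this.caller = caller;
    }

    // A pattern part matches if it is the wildcard or equals the value.
    private static boolean partMatches(String pattern, String value) {
        return ANY.equals(pattern) || pattern.equals(value);
    }

    // True if this node label belongs to the set denoted by 'pattern',
    // i.e. the relation written n ∈ (object, name, caller) in the text.
    boolean matches(CommLabel pattern) {
        return partMatches(pattern.object, object)
            && partMatches(pattern.name, name)
            && partMatches(pattern.caller, caller);
    }
}
```

For instance, the label (r1, start, main) matches the pattern (r1, start, *) but not (r2, start, *).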

For the purposes of our analysis, additional modeling is required for wait()
method calls and synchronized blocks. Because an entrance to or exit from a
synchronized block by one thread may influence executions of other threads, we represent
the entrance and exit points of synchronized blocks with additional nodes labeled
(lock, entry, t) and (lock, exit, t), where t is the thread modeled by the CFG
and lock is the lock object of the synchronized block. We assume that the thread
enters the synchronized block immediately after the entry node is executed and exits
this block immediately after the exit node is executed. Thus, the entry node is outside
the synchronized block and the exit node is inside this block.

The execution of a wait() method by a thread involves several activities. The thread
releases the lock of the monitor containing this wait() call and then becomes inactive.
After the thread receives a notification, it first has to re-acquire the lock of the monitor
before it can continue its execution. To reason about all these implicit activities of a thread,
we perform a transformation that replaces each node representing a wait() method call
with three different nodes, as illustrated in Figure 2. The node labeled (lock, wait, t)
represents the execution of the wait() method, the node labeled (lock, waiting, t)
represents the thread being idle while waiting for a notification, and the node labeled
(lock, notified-entry, t) represents the thread after it received a notification and
is in the process of trying to obtain the lock to re-enter the synchronized block. The
shaded regions in the figure represent the synchronized block.
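
The transformation can be sketched on a simplified representation. Here a thread's CFG is reduced to a straight-line list of node labels, which is an assumption of this sketch (a real CFG also has branches and loops), and the class WaitTransform is a hypothetical name. Each wait node is kept and followed by a waiting node and a notified-entry node for the same lock and thread.

```java
import java.util.ArrayList;
import java.util.List;

class WaitTransform {
    // Each node is a label {object, name, caller}. Returns a new node
    // list in which every wait node is followed by its waiting and
    // notified-entry nodes.
    static List<String[]> expand(List<String[]> nodes) {
        List<String[]> out = new ArrayList<>();
        for (String[] n : nodes) {
            out.add(n);
            if (n[1].equals("wait")) {
                out.add(new String[] {n[0], "waiting", n[2]});
                out.add(new String[] {n[0], "notified-entry", n[2]});
            }
        }
        return out;
    }
}
```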

The CFGs for all threads in the program are combined in a PEG by adding special
kinds of edges between nodes from different CFGs. We define the following edge
kinds. A waiting edge is created between waiting and notified-entry nodes
(the bold edge in Figure 2). A local edge is a non-waiting edge between two nodes
from the same CFG. For example, the edge between nodes labeled (lock, wait, t)
and (lock, waiting, t) is a local edge. A notify edge is created from a node m to a
node n if m is labeled (obj, notify, r) or (obj, notifyAll, r) and n is labeled
(obj, notified-entry, s) for some thread object obj, where threads r and s are
different. The set of notify edges is not precomputed but rather built during the algorithm.
This improves the precision of the algorithm since information does not propagate
into notified-entry nodes from notify and notifyAll nodes until it is determined
that these statements may happen in parallel. A start edge is created from a node
m to a node n if m is labeled (t, start, *) and n is labeled (*, begin, t). That is, m
represents a node that calls the start() method of the thread t and n is the first node
in the CFG of this thread. All start edges can be computed by syntactically matching
node labels.

Figure 3 shows the PEG for the program in Figure 1. The shaded regions include
nodes in the monitor of the program; thin solid edges represent local control flow within
individual threads; the thick solid edges are waiting edges; the dotted edges are start
edges; and the dashed edges are notify edges. (Note that these notify edges are not present
in the PEG originally but will be created during execution of the MHP algorithm.)

Fig. 3. PEG example
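
The syntactic matching that yields start edges can be sketched as follows. The class StartEdges and the array encoding of labels are assumptions of this sketch. A start edge connects every node labeled (t, start, *) to the node labeled (*, begin, t).

```java
import java.util.ArrayList;
import java.util.List;

class StartEdges {
    // nodes.get(i) is the label {object, name, caller} of PEG node i.
    // Returns edges {from, to} for each (t, start, *) -> (*, begin, t) pair.
    static List<int[]> compute(List<String[]> nodes) {
        List<int[]> edges = new ArrayList<>();
        for (int m = 0; m < nodes.size(); m++) {
            String[] a = nodes.get(m);
            if (!a[1].equals("start")) continue;
            for (int n = 0; n < nodes.size(); n++) {
                String[] b = nodes.get(n);
                // The started thread is the object part of the start
                // label and the caller part of the begin label.
                if (b[1].equals("begin") && b[2].equals(a[0])) {
                    edges.add(new int[] {m, n});
                }
            }
        }
        return edges;
    }
}
```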

2.3 Additional Terminology

For convenience, we define a number of functions. LocalPred(n) returns the set of
all immediate local predecessors of n; NotifyPred(n) returns the set of all notify
predecessors of a notified-entry node n; StartPred(n) returns the set of all start
predecessors of a begin node n; and WaitingPred(n) returns a single waiting predecessor
of a notified-entry node. Sets of successors LocalSucc(n), NotifySucc(n),
StartSucc(n), and WaitingSucc(n) are defined similarly. Let T denote the set of all
threads that the program may create. Let N(t) denote the set of all PEG nodes in thread
t ∈ T. Furthermore, we define a function thread : N → T that maps each node in the
PEG to the thread to which this node belongs. For example, for the PEG in Figure 3,
thread(nz) = main and thread(ni^) = r1.

For convenience, we associate two sets with each lock object. notifyNodes(obj) is
the set of all notify and notifyAll nodes for lock object obj: notifyNodes(obj) =
(obj, notify, *) ∪ (obj, notifyAll, *). Similarly, waitingNodes(obj) is the set
of all waiting nodes for lock object obj: waitingNodes(obj) = (obj, waiting, *).
For the example in Figure 3, notifyNodes(buffer) = {n7} and waitingNodes(buffer) =
{n17, n27}.

In Java, monitors are given explicitly by synchronized blocks and methods. 
Since our model captures a set of known threads, we can also statically compute the 
set of nodes representing code in a specific monitor. Let Monitor_obj denote the set of
PEG nodes in the monitor for the lock of object obj. For the example in Figure 3,

Monitor_buffer = {n6, n7, n8, n15, n16, n19, n20, n25, n26, n29, n30}.

3 The MHP Algorithm 

In this section we present the data flow equations for the MHP algorithm. We do this
instead of using the lattice/function space view of data flow problems [7] since it makes 
explanations of this algorithm more intuitive and, as will be evident, one aspect of 
this algorithm precludes its representation as a purely forward- or backward data flow 
problem or even as a bidirectional or multisource [16] data flow problem. At the end 
of this section we present a pseudo-code version of the worklist version of the MHP 
algorithm. 

3.1 High-Level Overview 

Initially we assume that each node in the PEG may not happen in parallel with any other 
nodes. The data flow algorithm then uses the PEG to infer that some nodes may happen 
in parallel with others and propagates this information from one node to another, until a 
fixed point is reached. At this point, for each node, the computed information represents 
a conservative overapproximation of all nodes that may happen in parallel with it. 

To each node n of the PEG we assign a set M(n) containing nodes that may happen
in parallel with node n, as computed at a given point in the algorithm. In addition to
the M set, we associate an OUT set with each node in the PEG. This set includes MHP
information to be propagated to the successors of the node. The reason for distinguishing
between the M(n) and OUT(n) sets is that, depending on the thread synchronizations
associated with node n, it is possible that a certain node m may happen in parallel with
node n but may never happen in parallel with n’s successors or that some nodes that may
not happen in parallel with n may happen in parallel with n’s successors. Section 3.4
gives a detailed description of all cases where nodes are added to or removed from the
M set of a node to obtain the OUT set for this node. Initially, M and OUT sets for all
nodes are empty.

We propose a worklist form of the MHP algorithm. At the beginning, the worklist 
is initialized to contain all start nodes in the main thread of the program that are 
reachable from the begin node of this main thread. The reason for this is that places 
in the main thread of the program where new threads are started are places where new 
parallelism is initiated. The MHP algorithm then runs until the worklist becomes empty. 
At each step of the algorithm a node is removed from the worklist and the notify edges 
that come into this node, as well as the M and OUT sets for this node, are recomputed. If 
the OUT set changes, all successors of this node are added to the worklist. The following 
four subsections describe the major steps taken whenever a node is removed from the 
worklist. This node is referred to as the current node. 
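
The control structure of the worklist form can be sketched generically. This is only a skeleton under stated assumptions: the interface Step is a hypothetical hook standing in for the recomputation of notify edges, M, and OUT described in the following subsections, and the successor map is assumed to contain all successors of each node.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;

class WorklistSkeleton {
    // Hypothetical hook: recomputes the information for 'node' and
    // reports whether OUT(node) changed.
    interface Step {
        boolean recompute(int node);
    }

    // Processes nodes until the worklist is empty and returns the
    // number of steps taken.
    static int run(List<Integer> initial,
                   Map<Integer, List<Integer>> succ,
                   Step step) {
        Deque<Integer> worklist = new ArrayDeque<>(initial);
        int steps = 0;
        while (!worklist.isEmpty()) {
            int current = worklist.removeFirst();
            steps++;
            if (step.recompute(current)) {
                // OUT changed: every successor must be reconsidered.
                List<Integer> next = succ.get(current);
                if (next == null) next = Collections.emptyList();
                for (int s : next) {
                    if (!worklist.contains(s)) worklist.addLast(s);
                }
            }
        }
        return steps;
    }
}
```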



3.2 Computing Notify Edges 

Notify edges connect nodes representing calls to the notify() and notifyAll() methods
of an object to notified-entry nodes for this object. The intuition behind
these edges is to represent the possibility that a call to the notify() or notifyAll()
method wakes a waiting thread (the waiting state of this thread is represented by the
corresponding waiting node) and this thread consequently enters the corresponding
notified-entry node. This is possible only if the waiting node and the notify
or notifyAll node may happen at the same time. Thus, the computation of notify
successors for the current node can be captured concisely as

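\[
\mathit{NotifySucc}(n) =
\begin{cases}
\{\, m \mid m \in (\mathit{obj}, \texttt{notified-entry}, *) \land \mathit{WaitingPred}(m) \in M(n) \,\}, & \text{if } n \in \mathit{notifyNodes}(\mathit{obj}) \\
\text{undefined}, & \text{otherwise.}
\end{cases}
\]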
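
The rule above can be sketched in Java as follows. This is only an illustration under simplifying assumptions, not the authors' implementation: PEG nodes are represented as plain strings, and the class name NotifyEdges, the method notifySucc, and its parameters are hypothetical names introduced for this example.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified PEG bookkeeping: nodes are plain strings and the
// arguments stand in for the paper's M(n), label lookup and WaitingPred.
class NotifyEdges {
    /**
     * Returns the notify successors of a notify/notifyAll node n on a lock
     * object obj: every notified-entry node m of obj whose waiting
     * predecessor is currently in M(n).
     */
    static Set<String> notifySucc(Set<String> mOfN,
                                  Set<String> notifiedEntryNodesOfObj,
                                  Map<String, String> waitingPred) {
        Set<String> result = new HashSet<>();
        for (String m : notifiedEntryNodesOfObj) {
            String w = waitingPred.get(m);
            // Keep m only if its waiting node may happen in parallel with n.
            if (w != null && mOfN.contains(w)) {
                result.add(m);
            }
        }
        return result;
    }
}
```

The caller is expected to invoke this only for nodes in notifyNodes(obj); for all other nodes the set of notify successors is undefined.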



3.3 Computing M Sets 

To compute the current value of the M set for the current node, we use the OUT sets 
of this node’s predecessors, as well as information depending on the label of this node. 
Equation (1) gives the rule for computing the M set for nodes with all possible labels 
(here and in the rest of the paper, “\” stands for the set subtraction operation). 



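\[
M(n) = M(n) \cup
\begin{cases}
\left( \bigcup_{p \in \mathit{StartPred}(n)} \mathit{OUT}(p) \right) \setminus N(\mathit{thread}(n)), & \text{if } n \in (*, \texttt{begin}, *) \\
\left( \left( \bigcup_{p \in \mathit{NotifyPred}(n)} \mathit{OUT}(p) \right) \cap \mathit{OUT}(\mathit{WaitingPred}(n)) \right) \cup \mathit{GEN}_{\mathit{notifyAll}}(n), & \text{if } n \in (*, \texttt{notified-entry}, *) \\
\bigcup_{p \in \mathit{LocalPred}(n)} \mathit{OUT}(p), & \text{otherwise}
\end{cases}
\tag{1}
\]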



As seen in equation (1), for begin nodes, the M set is computed as the union
of the OUT sets of all start predecessors of this node, with all nodes from the thread
of the current node excluded^. The explanation is that since the start() method is
non-blocking, the first node in the thread that is started may execute in parallel with all
nodes that may execute in parallel with the node that started it.

For a notified-entry node n, first we compute the union of the OUT sets of
all notify predecessors of this node. The resulting set of nodes is then intersected with
the OUT set of the waiting predecessor of n, and then the $\mathit{GEN}_{\mathit{notifyAll}}(n)$ set (defined in
equation (2) below) is added to the result. The intuition behind taking the union of the
OUT sets of all notify predecessors is that once a thread executes wait(), it becomes
idle, and quite some time can pass before it is awakened by some other thread. Only after
this happens (after a notify() or notifyAll() method call) can this thread resume
its execution. This means that, in effect, these notify and notifyAll nodes are the
“logical” predecessors of the node that follows the waiting node. The reasoning for
intersecting the resulting set with the OUT set of n’s waiting predecessor is that n
can execute only if (1) the thread of n is waiting for a notification and (2) one of the
notify predecessors of n executes.

The $\mathit{GEN}_{\mathit{notifyAll}}$ set in equation (1) handles the special case of a notifyAll
statement awakening multiple threads. In this case the corresponding notified-entry
nodes in these threads may all execute in parallel. We conservatively estimate the sets
of such notified-entry nodes from threads other than that of the current node n.
A node m is put in $\mathit{GEN}_{\mathit{notifyAll}}(n)$ if m refers to the same lock object obj as n does,
the WaitingPred nodes of m and n may happen in parallel, and there is a node r labeled
(obj, notifyAll, *) that is a notify predecessor of both m and n. Formally,



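\[
\mathit{GEN}_{\mathit{notifyAll}}(n) =
\begin{cases}
\emptyset, & \text{if } n \notin (\mathit{obj}, \texttt{notified-entry}, *) \\
\{\, m \mid m \in (\mathit{obj}, \texttt{notified-entry}, *) \land {} \\
\quad \mathit{WaitingPred}(n) \in M(\mathit{WaitingPred}(m)) \land {} \\
\quad (\exists r \in N : r \in (\mathit{obj}, \texttt{notifyAll}, *) \land {} \\
\quad r \in (M(\mathit{WaitingPred}(m)) \cap M(\mathit{WaitingPred}(n)))) \,\}, & \text{if } n \in (\mathit{obj}, \texttt{notified-entry}, *)
\end{cases}
\tag{2}
\]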



3.4 Computing OUT Sets 

The OUT(n) set represents MHP information that has to be passed to the successors of
n and is computed as shown in equation (3).

\[
\mathit{OUT}(n) = (M(n) \cup \mathit{GEN}(n)) \setminus \mathit{KILL}(n) \tag{3}
\]

GEN(n) is the set of nodes that, although they may not be able to execute in parallel
with n, may execute in parallel with n’s successors. KILL(n) is a set of nodes that must
not be passed to n’s successors, although n itself may happen in parallel with some of
these nodes. Computation of both GEN and KILL for the current node depends on the
label of this node.
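
Equation (3) amounts to two set operations. The following Java fragment is a minimal sketch of that computation; the class name OutSets and the method out are hypothetical, and nodes may be of any type with sensible equality.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper illustrating equation (3); not taken from the paper.
class OutSets {
    // OUT(n) = (M(n) union GEN(n)) minus KILL(n)
    static <T> Set<T> out(Set<T> m, Set<T> gen, Set<T> kill) {
        Set<T> result = new HashSet<>(m); // copy, so M(n) itself is untouched
        result.addAll(gen);               // nodes that only the successors may overlap with
        result.removeAll(kill);           // nodes that must not reach the successors
        return result;
    }
}
```

For example, with M(n) = {a, t1, t2}, GEN(n) = {b} and KILL(n) = {t1, t2}, the call returns {a, b}.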



^ begin nodes are included in the OUT sets of the corresponding start nodes. See
equations (3) and (4).

The following equation gives the rule for computing the GEN set for the current
node n.



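\[
\mathit{GEN}(n) =
\begin{cases}
(*, \texttt{begin}, t), & \text{if } n \in (t, \texttt{start}, *) \\
\mathit{NotifySucc}(n), & \text{if } \exists\, \mathit{obj} : n \in \mathit{notifyNodes}(\mathit{obj}) \\
\emptyset, & \text{otherwise}
\end{cases}
\tag{4}
\]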



For start nodes, GEN consists of a single node that is the begin node in the thread
that is being started. Suppose the current node is in thread r, starting thread t. Once
this node is executed, thread t is ready to start executing in parallel with thread r. Thus,
the begin node of thread t has to be passed to all successors of the current node. Note
that thread t cannot start until the start node completes its execution. This means this
start node and the begin node of thread t may not happen in parallel, and so the
begin node is not in the M set of the start node.

For notify and notifyAll nodes, the GEN set equals the set of all their notify
successors. This conveys to the local successors of such notify and notifyAll
nodes that they may happen in parallel with all such notified-entry nodes. Note
that these notified-entry nodes may or may not be in the M sets of the notify
and notifyAll nodes because a thread that is being awakened becomes notified only
after the corresponding notify or notifyAll node completes its execution.

The computation of the KILL(n) set is shown below.



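\[
\mathit{KILL}(n) =
\begin{cases}
N(t), & \text{if } n \in (t, \texttt{join}, *) \\
\mathit{Monitor}_{\mathit{obj}}, & \text{if } n \in (\mathit{obj}, \texttt{entry}, *) \cup (\mathit{obj}, \texttt{notified-entry}, *) \\
\mathit{waitingNodes}(\mathit{obj}), & \text{if } (n \in (\mathit{obj}, \texttt{notify}, *) \land |\mathit{waitingNodes}(\mathit{obj})| = 1) \lor (n \in (\mathit{obj}, \texttt{notifyAll}, *)) \\
\emptyset, & \text{otherwise}
\end{cases}
\tag{5}
\]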



If the current node n represents joining another thread t, the thread containing the
current node will block until thread t terminates. This means that after n completes its
execution, no nodes from t may execute. Thus, all nodes from $M(n) \cap N(t)$ should be
taken out of the set being passed to n’s successors.

Computing the KILL set for entry and notified-entry nodes is quite intuitive.
While a thread is executing in an entry or notified-entry node, it is not yet inside the
monitor whose entrance this node represents. Once the execution of this node terminates, the
thread is inside this monitor. Thus, the successors of such an entry or notified-entry
node may not happen in parallel with any nodes from this monitor.

Finally, if the current node is a notifyAll node for lock object obj, this means
that once this node completes its execution, no threads in the program will be waiting on
this object. Thus, no nodes labeled (obj, waiting, *) must be allowed to propagate
to the local successors of the current node. If the current node is a notify node, its
execution wakes up no more than one thread. If there is exactly one waiting node,
this waiting node must finish execution by the time this notify node finishes its
execution.

3.5 Symmetry Step

Up to this point the algorithm is a standard forward data flow algorithm. After computing
M and OUT sets for each node, however, we have to take a step that is outside this
classification to ensure the symmetry $n_1 \in M(n_2) \Leftrightarrow n_2 \in M(n_1)$. We do this by
adding $n_2$ to $M(n_1)$ if $n_1 \in M(n_2)$. The nodes whose M sets have been updated in this
way are added to the worklist, since the change in their M sets may result in a change
in their OUT sets, and so influence other nodes in the graph.
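
The symmetry step can be sketched in Java as shown here. The sketch assumes that M sets are kept in a map from node to set and that the worklist is a deque without duplicates; the names SymmetryStep, apply, mOld, and mSets are hypothetical and chosen only for this illustration.

```java
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the symmetry step; nodes are plain strings.
class SymmetryStep {
    /**
     * For every node m that was newly added to M(n) (that is, m is in M(n)
     * but not in the saved copy mOld), adds n to M(m) and puts m on the
     * worklist so that OUT(m) is recomputed.
     */
    static void apply(String n, Set<String> mOld,
                      Map<String, Set<String>> mSets,
                      Deque<String> worklist) {
        Set<String> added = new HashSet<>(mSets.getOrDefault(n, Set.of()));
        added.removeAll(mOld);
        for (String m : added) {
            mSets.computeIfAbsent(m, k -> new HashSet<>()).add(n);
            if (!worklist.contains(m)) { // the worklist behaves like a set
                worklist.addLast(m);
            }
        }
    }
}
```

Passing the saved copy of M(n) keeps the step proportional to the number of newly added nodes rather than to the size of M(n).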

3.6 Worklist Version of the MHP Algorithm 

The Java MHP algorithm, based on the equations described above, consists of two 
stages. The initialization stage computes KILL sets for all nodes, as well as the GEN sets 
for the start nodes. All steps of this stage correspond to computations described by 
equations (4) and (5). The iteration stage computes M and OUT sets and notify edges 
using a worklist containing all nodes that have to be investigated. Both stages of the 
algorithm are shown in Figure 4. 

3.7 Termination, Conservativeness, and Complexity 

For Java programs satisfying the assumptions noted in Section 2, we have proved that
the MHP algorithm terminates, that MHP information computed by this algorithm is
conservative, in the sense that the sets it computes contain the ideal MHP sets, and that
the worst-case time bound for this algorithm is $O(|N|^3)$. The precise formulations of
these statements and their proofs are given in [18].



4 Experimental Results 

We measure the precision of both the MHP algorithm and the reachability analysis
by the number of pairs of nodes that each approach claims can happen in parallel.
The smaller this number, the more precise the approach, since neither approach can
underestimate the set of nodes that may happen in parallel. We write $P_{\mathit{MHP}}$ for
the set of pairs found by the MHP algorithm and $P_{\mathit{Reach}}$ for the set of pairs found
by the reachability analysis. We say that the MHP algorithm is perfectly precise if
$P_{\mathit{MHP}} = P_{\mathit{Reach}}$. For the cases where this equality does not hold, we are interested in the
ratio between the number of spurious MHP pairs and the number of all pairs found by
the reachability analysis, $|P_{\mathit{MHP}} \setminus P_{\mathit{Reach}}| / |P_{\mathit{Reach}}|$. (Note that conservativeness of our
algorithm guarantees $P_{\mathit{Reach}} \subseteq P_{\mathit{MHP}}$.)

Because there is no standardized benchmark suite of concurrent Java programs, 
we collected a set of Java programs from several available sources. We modified these 
programs so that we could use a simple parser to create the PEG without any preliminary 
semantic analysis. In addition, we removed the timed versions of the synchronization
statements and any exception handling, since these are currently not handled in our 
algorithm. 

The majority of our examples came from Doug Lea’s book on Java concurrency [11]
and its Web supplement [12]. For most of these examples Lea gives only the classes 
implementing various synchronization schemes, sometimes with a brief example of 
their use in concurrent programs. We used these synchronization schemes to construct 

Input: CFGs for all threads in the program and ∀n ∈ N: sets KILL(n) and GEN(n)

Output: ∀n ∈ N: a set of PEG nodes M(n) such that ∀m ∉ M(n): m may never happen in
parallel with n.

Additional Information: W is the worklist containing nodes to be processed;
∀n ∈ N, OUT(n) is the set of nodes to be propagated to the successors of n

Initialization: ∀n ∈ N: KILL(n) = GEN(n) = M(n) = OUT(n) = ∅
Initialize the worklist W to include all start nodes in the main thread that are reachable from
the begin node of the main thread

THE FIRST STAGE:

    ∀n ∈ N:
        case
            n ∈ (t, join, *) ⇒ KILL(n) = N(t)
            n ∈ (obj, entry, *) ∪ (obj, notified-entry, *) ⇒ KILL(n) = Monitor_obj
            n ∈ (obj, notifyAll, *) ⇒ KILL(n) = waitingNodes(obj)
            n ∈ (obj, notify, *) ⇒
                if |waitingNodes(obj)| = 1 then
                    KILL(n) = waitingNodes(obj)
            n ∈ (t, start, *) ⇒ GEN(n) = (*, begin, t)

THE SECOND STAGE: Main loop.

We evaluate the following statements repeatedly until W = ∅

    // n is the current node:
    n = head(W)
    // n is removed from the worklist:
    W = tail(W)
    // M_old, OUT_old, and NotifySucc_old are the copies of the M, OUT, and NotifySucc
    // sets for this node, computed to determine new nodes inserted in these sets on
    // this iteration
    M_old = M(n)
    OUT_old = OUT(n)
    NotifySucc_old = NotifySucc(n)
    // computing the new set of notify successors for notify and notifyAll nodes
    if ∃obj : n ∈ notifyNodes(obj) then
        ∀m ∈ M(n) ∩ waitingNodes(obj):
            // create a new notify edge from node n to the waiting successor of node m
            NotifySucc(n) = NotifySucc(n) ∪ {WaitingSucc(m)}
        // if new notify edges were added from this node, add all notify successors
        // of this node to the worklist
        if NotifySucc_old ≠ NotifySucc(n) then
            W = W ∪ NotifySucc(n)
    Compute the set GEN_notifyAll(n) as in equation (2)
    Compute the set M(n) as in equation (1)
    // the only nodes for which the GEN set has to be recomputed are notify and
    // notifyAll nodes; their GEN sets are their notify successors:
    if ∃obj : n ∈ notifyNodes(obj) then
        GEN(n) = NotifySucc(n)
    Compute the set OUT(n) as in equation (3)
    // do the symmetry step for all new nodes in M(n):
    if M_old ≠ M(n) then
        ∀m ∈ (M(n) \ M_old):
            M(m) = M(m) ∪ {n}
            // add m to the worklist because the change in M(m) may lead to a
            // change in OUT(m)
            W = W ∪ {m}
    // if new nodes have been added to the OUT set of n, add all n's successors to the worklist
    if OUT_old ≠ OUT(n):
        W = W ∪ (LocalSucc(n) ∪ StartSucc(n))

Fig. 4. MHP algorithm

Fig. 5. Raw experimental data



complete multi-threaded programs. We selected sizes of the examples that could be 
handled by our reachability tool. 

Several examples in our set came from other sources [1, 2, 3] on the Web. Finally, 
we wrote Java implementations for several of the Ada concurrent examples, such as 
dining philosophers, that are commonly used in the concurrency analysis literature. All 
29 examples that we used in our experiments are described in [18]. 

For each example we measured three times: the time to build the PEG model, the time
to run the MHP algorithm on this model, and the time to run the reachability analysis on
this model. Both approaches are implemented in Java. For our experiments, we used a
Symantec JIT compiler for JDK 1.1 on a workstation equipped with a 400 MHz Pentium
II processor and 128 MB of memory, running Windows NT.

Figure 5 presents the raw data from running the MHP and reachability algorithms on
our set of examples. In this figure, for each example Java program, the first column gives 
the name of the program; the second column gives the number of threads, including the 
main thread; the third column gives the overall number of nodes in the PEG model of 
the example; the fourth column gives the number of nodes that are used to model thread 
synchronizations, e.g., waiting nodes; the fifth column gives the number of node pairs 
found by the MHP algorithm; the sixth column gives the number of node pairs found 
by the MHP algorithm but not by the reachability analysis (thus, the number of node 
pairs found by the reachability analysis can be found by subtracting the number in the 
sixth column from the number in the fifth column); finally, the seventh, eighth, and ninth 
columns show the time in seconds taken to construct the PEG for the example and to 
run the MHP and reachability algorithms respectively. 

Out of the 29 example programs, the MHP algorithm was less precise than the
reachability algorithm on only two examples, CHAN_OF_INT and SplitRendererNested.
In both of these cases the number of spurious pairs was small compared to the total
number of pairs of nodes that may happen in parallel (40 out of 934 and 50 out of 677
respectively).
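
As a worked instance of the precision ratio defined at the start of this section, the two imprecise examples can be evaluated directly. The calculation assumes that 934 and 677 are the sizes of $P_{\mathit{Reach}}$ for the two programs; the text does not state this explicitly, so the alternative reading is shown as well.

```latex
\[
\frac{|P_{\mathit{MHP}} \setminus P_{\mathit{Reach}}|}{|P_{\mathit{Reach}}|}
  = \frac{40}{934} \approx 0.043
\qquad\text{and}\qquad
\frac{50}{677} \approx 0.074 .
\]
% Alternative reading: if 934 and 677 are instead the sizes of P_MHP,
% then |P_Reach| is 934 - 40 = 894 and 677 - 50 = 627 respectively.
\[
\frac{40}{894} \approx 0.045
\qquad\text{and}\qquad
\frac{50}{627} \approx 0.080 .
\]
```

Under either reading the share of spurious pairs stays below about 8 percent.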

The timing data indicate that in practice the MHP algorithm is very efficient. For all 
examples, except the AutomatedBanking and PessimBankAccount examples, running 
the MHP algorithm took under 0.3 second. For all but the simplest examples, running 
the MHP algorithm took much less time than running the reachability analysis. In fact, 
for most examples it took more time to construct the PEG model than it took to run the 
MHP algorithm. 

5 Conclusions 

Information about which pairs of statements may execute in parallel has important appli- 
cations in optimization, detection of anomalies such as race conditions, and improving 
the accuracy of data flow analysis. Efficient and precise algorithms for computing this 
information are therefore of considerable value. In this paper, we have described a data 
flow method for computing a conservative approximation of the set of pairs of state- 
ments in a concurrent Java program that may execute in parallel. Our algorithm has a 
worst-case bound that is cubic in the number of statements in the program. 

We carried out an initial experiment evaluating the precision of our algorithm against 
the precision of a technique based on exhaustive exploration of the program state space. 
Since this reachability technique, which is exponential in the program size, is not practical 
in general, we restricted the size of our example programs to those for which we could 
compute the “ideally” precise MHP information. On 27 of the 29 example programs, 
the MHP algorithm produced as precise results as the reachability analysis. 

In the future, we plan to improve the applicability of the MHP algorithm by elimi- 
nating the use of inlining in constructing the program model. Even in its current form, 
the MHP algorithm does not require inlining methods that do not contain thread
synchronizations. Such method calls may be represented in the PEG for the program with
a single node, where MHP information computed for this node is sufficient to determine 
MHP information for all nodes in the corresponding method. Thus, if n is a call node 
for method M, then any node in the body of M may happen in parallel with any node 
that may happen in parallel with a node representing the call to M. Special care must be 
taken when there is a possibility that a method may be called by more than one thread, in 
which case executions of multiple instances of this method may overlap in time. In this 
case, unlike thread nodes, MHP information for the nodes from this method will contain
other nodes from the same method. To determine whether this might happen, we have 
to check whether any of the call nodes to M is in the MHP set of any of the other call 
nodes to this method (this has to be done recursively for nested method calls), in which 
case the MHP sets of all nodes in M must contain all nodes in M. 

In the case of methods containing thread synchronization mechanisms, we plan to use 
a context-sensitive approach, extending the PEG model to include method call and return 
edges, similar to the approach of [8], and modifying the MHP algorithm accordingly. 

The run time performance of the MHP algorithm can benefit from optimizations of 
the PEG model. For example, node coarsening approaches, such as described in [13, 15], 
replace sequential regions of code that have no interaction with other threads by a single 
node. The resulting reduction in the number of nodes in the parallel execution graph 
should improve the performance of the MHP algorithm. 

At present, the MHP algorithm is being used as a part of the FLAVERS/Java tool [19] 
for data flow-based verification of application-specific properties of concurrent Java
programs. The program model used by FLAVERS represents the possibility of interleaving
of events from different threads by edges of a special kind. We use the results of the 
MHP algorithm for computing such edges by creating an edge from node n to node m 
if m is placed in set OUT (n) by the MHP algorithm. Using the MHP algorithm results 
in a more precise model of concurrent execution. We plan to measure the impact of the 
precision improvements obtained and overheads incurred by using the MHP algorithm 
on data flow algorithms used in verification of concurrent Java programs. 

Abstract. Usually, programming languages are used according to conventions
and rules. Although general rules can be enforced by lint-like
tools, there is a large class of rules that cannot be built into such tools be- 
cause they result from particular design decisions or the reuse of existing 
software. This paper presents a system, called CoffeeStrainer, that stati- 
cally checks programmer-specified constraints on Java programs. Unlike 
previous approaches, which only support constraints that apply to def- 
initions of types, CoffeeStrainer additionally supports a second class of 
constraints which apply to all uses of a type. Both classes of constraints 
play an important role for object-oriented class libraries and frameworks, 
which often make assumptions on their correct use. 



1 Introduction 

It is generally desirable to detect program errors as early as possible during soft- 
ware development. Statically typed languages allow many errors to be detected 
at compile-time. However, many problems that could be detected statically can- 
not be expressed using today’s type systems. In fact, in any reasonably sized 
software development project, conventions and rules constraining the structure 
of the application under development must be obeyed by the programmers, rang- 
ing from simple coding conventions to design constraints caused by, e.g., using 
design patterns. Most of these constraints could be enforced at compile-time, but 
there exists little support for checking them statically. Although general rules 
can be enforced by lint [7] and similar tools, there is a large class of rules that 
cannot be built into such tools because they result from particular design deci- 
sions or the reuse of existing software. Thus, tool support for statically checkable 
programmer-defined constraints is needed, which should be both expressive and 
useable for everyday programmers. 

This paper presents a system, called CoffeeStrainer, that statically checks
programmer-specified constraints on Java programs. From the perspective of a
framework developer, it allows rules for using the framework correctly to be specified;
from the perspective of framework users, it is a tool that can detect incorrect
uses or specializations of a framework. Rather than inventing a new constraint
language, it takes a pragmatic approach for specifying constraints in that Java is
used as a constraint language as well; thus, using CoffeeStrainer does not require
learning new syntax.

Compared with previous work on specifying implementation or design con- 
straints for object-oriented programs [2, 3, 4, 8, 9], CoffeeStrainer is different in the 
following aspects: 

— Instead of defining a new special-purpose language, constraints are specified
in Java, a language the programmer already knows;

— The system is implemented as an open object-oriented framework that ex- 
ecutes programmer-defined constraint code at compile-time; it can be ex- 
tended and modified by defining new object-oriented abstractions that are 
used by the constraint code; 

— The constraint code and the base-level code share the same structure, making 
it easy to find the rules that apply to a given part of the program; 

— The constraint code is embedded in special comments, leaving syntax and 
semantics of Java programs unchanged; thus, arbitrary compilers and other 
tools can operate on the source code; 

— When defining a new rule, the programmer has access to a complete abstract 
syntax tree (AST) of the program that is to be checked; thus, constraints 
are not restricted to the signatures of classes and methods; 

— Special support is provided not only for constraining definitions of types, 
but additionally for constraining the usage of classes and interfaces. 

Following [2], there are three categories of constraints: stylistic constraints are
concerned with names and other aspects of a program that, when changed,
do not affect its semantics; implementation constraints deal with problematic
language constructs or cover common traps and pitfalls that may easily lead
to subtle programming errors; and design constraints reflect programming rules
for the correct use of a framework, or coding conventions resulting from the use
of design patterns. In the remainder of this section, we will list quite a large
number of examples for such constraints, both because we want to convey the
broad scope of constraints that can be specified and checked with CoffeeStrainer,
and because we want to show that such constraints are ubiquitous in any software
development project.

Examples for stylistic constraints that can be specified with our system and
checked at compile-time are:

— Package names should be lowercase only; 

— In a class definition, the declarations of public variables, public constructors, 
and public methods should precede the private declarations; 

— The scope of local variables should be minimal, i.e. a variable declaration 
should be in the smallest block that contains all uses of the variable. 

Examples for implementation constraints are: 

— A class that provides its own implementation of the method
public boolean equals(Object other) should also implement the method
public int hashCode() and vice-versa, because equal objects must have
the same hash code to be correctly added to and removed from hash-based
collections;

— Branches of an if-statement should be blocks rather than single statements,
because when adding a new statement to a single-statement branch,
programmers often forget to correctly group both statements in a block;

— String objects should be compared using the method equals() rather than
the identity operator ==.
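
The last constraint in the list above guards against a pitfall that is easy to reproduce. The following small Java program is an illustration only; the class and method names are hypothetical and are not part of CoffeeStrainer.

```java
// Illustration of why String contents should be compared with equals().
class StringComparison {
    // Compares by content; this is what the constraint asks for.
    static boolean sameText(String a, String b) {
        return a.equals(b);
    }

    // Compares by identity; true only if both references denote one object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "coffee";
        String built = new String("coffee"); // distinct object, same characters
        System.out.println(sameText(literal, built));   // true
        System.out.println(sameObject(literal, built)); // false
    }
}
```

Both strings hold the same characters, yet the identity comparison fails because new String(...) always creates a fresh object.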

Design constraints can be classified further into coding conventions which may 
help producing code that is easier to comprehend and maintain; inheritance 
constraints that specify a contract between a class and its subclasses, sometimes 
called inheritance contracts or reuse contracts [12]; and usage constraints that 
constrain how objects of a certain type may be used. 

Examples for coding conventions that can be specified using CoffeeStrainer are:

— All instance variables (fields) should have private access only, and accessor 
methods should be provided that have no other side effects; 

— When using Java RMI, classes that implement the interface java.rmi.Remote
should not be used in variable or field declarations; instead, interfaces
derived from Remote should be used, such that remote objects can always
be substituted for local objects;

— Classes from vendor- or platform-specific packages, such as sun.* or
com.microsoft.*, should not be used because the system under development will
be certified as 100% pure Java.

Inheritance constraints include for example: 

— All methods in subclasses of a class DBObject overriding the method
register() should call super.register() before doing anything else;

— In a method template() of an abstract class, the contained call to a
method “this.hook()” should not be removed, because subclasses rely
on the call to hook() (see [11] for a definition of template and hook methods,
a design idiom used in many design patterns);

— When using the Visitor pattern [6], classes that implement the interface
Visitable should implement the method public void accept(Visitor v)
by calling back the visitor object using a type-specific method
public visit<type>(<type> o), without doing anything else.
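
To make the Visitor constraint from the list above concrete, the following Java sketch shows classes whose accept methods conform to it. Circle, Square, NameCollector, and the visitCircle and visitSquare methods are hypothetical names used only for illustration; only Visitor, Visitable, and accept appear in the constraint itself.

```java
// Hypothetical element and visitor types, used only for illustration.
interface Visitor {
    void visitCircle(Circle c);
    void visitSquare(Square s);
}

interface Visitable {
    void accept(Visitor v);
}

class Circle implements Visitable {
    // Conforms to the constraint: accept() does nothing except call back
    // the type-specific visit method.
    public void accept(Visitor v) {
        v.visitCircle(this);
    }
}

class Square implements Visitable {
    public void accept(Visitor v) {
        v.visitSquare(this);
    }
}

// A simple visitor that records which kinds of elements it has seen.
class NameCollector implements Visitor {
    final StringBuilder names = new StringBuilder();
    public void visitCircle(Circle c) { names.append("circle;"); }
    public void visitSquare(Square s) { names.append("square;"); }
}
```

An accept method that performed additional work, or that called the wrong visit method, would violate the constraint even though it compiles.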

Examples of usage constraints are: 

— A specific framework method that for certain reasons needs to have public 
access (e.g., a public constructor required for object serialization) should 
not be called by user-level code; 

— Fields, variables, parameters, and result values of a certain type (e.g., an
enumeration type) may not contain the value null, i.e., only non-null values
should be used for initializing or assigning to fields and variables, for binding
to parameters, and for returning from methods;

— By implementing one of the empty interfaces Layer1, Layer2, Layer3, ...,
a class can be marked as belonging to a certain architectural layer; classes
should only call methods of classes that belong to the same layer or the layer
immediately below.

The remainder of the paper is organized as follows. Section 2 explains Cof- 
feeStrainer from an abstract perspective. Sections 3 and 4, using four example 
constraints of increasing complexity, explain how constraints are specified in 
CoffeeStrainer. Section 5 compares our system with related work, and section 6 
draws conclusions and points out directions for future work. 

2 Specifying Constraints with CoffeeStrainer 

CoffeeStrainer builds an abstract syntax tree (AST) of the program against 
which constraints are to be checked. The AST is complete, i.e., method bodies 
consisting of statements and expressions are represented, and the AST is aug- 
mented with information from name analysis (associating each use of a name 
with its declaration) and type analysis (associating each expression with its static 
type). All classes and methods that are supported by CoffeeStrainer’s AST are 
listed in the appendix of this paper. 

See figure 1 for an example program together with its AST. The program
consists of two classes A and B in a package p. Both classes contain one field
declaration; field f in class A, and field g in class B, which has an initializer
expression. The “tree” part of the AST consists of the nodes and the solid lines,
while additional references obtained from name and type analysis are shown
using dotted arrows. There are dotted arrows from both node A and node B to
node Object because both classes extend the root class java.lang.Object.
(The AST nodes for the package java.lang and for the methods of Object
are omitted from the diagram.)




Fig. 1. An example abstract syntax tree (AST) 

Given an abstract syntax tree, a single constraint can be thought of as a
function boolean check(AST a) that returns true iff the argument AST
a represents a program that satisfies the constraint. Obviously, this simple
approach has a number of drawbacks:

It does not allow conciseness, because usually only small parts of the AST
matter for each constraint, and it is tedious to programmatically traverse the 
AST until the interesting part is found. Although a special-purpose language - 
providing, for example, programming techniques like pattern matching known 
from functional languages - could help traversing the AST, we do not want to 
invent a new programming language for CoffeeStrainer. Rather, we consider the 
issue of language design orthogonal to the issue of providing a convenient way 
of traversing and accessing the AST of a program, because advanced language 
features like pattern matching certainly make sense not only for a constraint 
language, but for programming languages like Java in general. 

As a second drawback, the simple approach is not modular. From the per- 
spective of the programmer whose program is to be checked against a number 
of constraints, it would be difficult to find those constraints that apply to his 
particular code, if each constraint was applied globally. From the perspective 
of a constraint designer, there would be no easy way to reuse parts of existing
constraints, or to extend and refine constraints incrementally. 

Third, the simple approach is not efficient, because the traversal needed for
each constraint would be hard-coded into each constraint, in the worst case lead- 
ing to one complete traversal of the AST for each constraint. Again, a special- 
purpose language could help if it allowed factoring out the traversal code from 
each constraint. However, we do not consider inventing a new language as a real- 
istic option because using CoffeeStrainer should require as little effort as possible 
from programmers who already know Java. For these reasons, constraints in Cof- 
feeStrainer are specified as follows: 

Instead of a single entry point - the complete AST - CoffeeStrainer supports 
multiple entry points, called constraint methods, each of which applies to a 
specific AST node type, such as Class, Interface, Field, Block, While, or 
Assignment. Constraint methods are written as Java methods that have a 
single parameter - of an AST node type - and return true iff the constraint 
is satisfied. By convention, the names of constraint methods are prefixed 
with check, as in checkField(), checkParameter(), etc. The traversal of the 
AST is taken care of by CoffeeStrainer, which for each visited node of type T 
calls the constraint method check<T>() and outputs a warning if the constraint 
method returns false. Thus, only one traversal of the AST is needed, avoiding 
the above-mentioned efficiency problems. Note that it is still possible to 
specify constraints from a global perspective, e.g. by writing a method 
checkClass() and traversing the AST explicitly. However, experience with 
CoffeeStrainer showed that this is rarely necessary because most constraints 
can be specified as a conjunction of several simple and local constraint 
methods. 
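
The dispatch described above can be sketched in plain Java. This is only a minimal illustration under stated assumptions, not CoffeeStrainer's actual implementation: the node classes (SketchNode, SketchClass, SketchField), the class SampleConstraints, and the method traverse are hypothetical names, and reflection stands in for whatever mechanism the framework really uses.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature AST; CoffeeStrainer's real node classes differ.
abstract class SketchNode {
    final String name;
    final List<SketchNode> children = new ArrayList<>();
    SketchNode(String name) { this.name = name; }
    SketchNode add(SketchNode child) { children.add(child); return this; }
}

class SketchClass extends SketchNode {
    SketchClass(String name) { super(name); }
}

class SketchField extends SketchNode {
    final boolean isPrivate;
    SketchField(String name, boolean isPrivate) { super(name); this.isPrivate = isPrivate; }
}

// Constraint methods follow the naming convention check<T>(T node).
class SampleConstraints {
    public boolean checkSketchField(SketchField f) { return f.isPrivate; }
    public boolean checkSketchClass(SketchClass c) { return !c.name.isEmpty(); }
}

class DispatchSketch {
    // A single traversal: for each visited node of type T, call check<T> if it exists.
    static void traverse(SketchNode node, Object constraints, List<String> warnings) {
        String methodName = "check" + node.getClass().getSimpleName();
        try {
            Method m = constraints.getClass().getMethod(methodName, node.getClass());
            m.setAccessible(true);
            if (!(Boolean) m.invoke(constraints, node)) {
                warnings.add(methodName + " violated by " + node.name);
            }
        } catch (NoSuchMethodException e) {
            // No constraint method for this node type: nothing to check.
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        for (SketchNode child : node.children) {
            traverse(child, constraints, warnings);
        }
    }
}
```

Because every constraint method is looked up per visited node, adding a further check<T> method does not add a further pass over the tree.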

In order to make constraints modular, and to make the scope of each con- 
straint explicit, constraint methods are embedded inside special comments within 
the checked program’s source code. A constraint that appears in a class or 
interface definition applies to the AST of the class or interface itself and to 
all of its subtypes. For example, a constraint method checkField() that appears in a 
special comment within the code of a class A applies to the AST of A and to all 
ASTs of A’s subclasses. By defining additional constraint methods in a subclass 
B of A, the constraints that applied to A can be extended and refined for B and 
its subclasses. However, it is not possible to weaken constraints in subtypes; a 
constraint method that is placed in class A cannot be overridden in class B. If 
a constraint method is defined both in class A and class B, A’s constraint will 
apply to A and all subtypes of A (including B), and additionally, B’s constraint 
will apply to B and all subtypes of B. 
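
The scoping rule can be illustrated with ordinary Java reflection. The following sketch is an assumption-laden illustration, not part of CoffeeStrainer: the class ConstraintScopeSketch and the types ScopeMarker, ScopeA, and ScopeB are hypothetical. It computes the set of types whose constraints would apply to a given class, namely the class itself and all of its supertypes.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical base-level types used only for this illustration.
interface ScopeMarker {}
class ScopeA implements ScopeMarker {}
class ScopeB extends ScopeA {}

class ConstraintScopeSketch {
    // Returns the type itself plus every superclass and superinterface.
    // Constraints declared in any of these types apply, so a subtype can
    // add constraints but can never remove one declared further up.
    static Set<Class<?>> typesWhoseConstraintsApply(Class<?> type) {
        Set<Class<?>> result = new LinkedHashSet<>();
        collect(type, result);
        return result;
    }

    private static void collect(Class<?> type, Set<Class<?>> out) {
        if (type == null || !out.add(type)) {
            return; // end of the hierarchy, or type already visited
        }
        collect(type.getSuperclass(), out);
        for (Class<?> iface : type.getInterfaces()) {
            collect(iface, out);
        }
    }
}
```

For ScopeB, the result contains ScopeB, ScopeA, and ScopeMarker (plus Object), whereas for ScopeA it does not contain ScopeB, mirroring the rule that B's constraints do not apply to A.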

Technically, CoffeeStrainer extracts special comments (comments starting 
and ending with the character sequences /*- and -*/, respectively) and inserts 
the contained constraint methods into newly generated constraint classes, 
which are then compiled on the fly and dynamically loaded into the CoffeeStrainer 
system. Inheritance of normal Java classes is not reflected on the constraint 
class level, which is why overriding of constraint methods is not possible. During 
traversal of the AST, the constraint methods of the constraint classes are called 
according to the Visitor design pattern [6] whenever their signature matches 
the type of the currently visited AST node, and warnings are printed for each 
constraint method invocation that returns false. 
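
The extraction step can be sketched as follows, assuming the delimiters /*- and -*/ that appear in the listings of this paper. The class SpecialCommentSketch and its method extract are hypothetical, and the regular expression is a simplification: unlike a real parser, it does not account for delimiters that occur inside string literals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpecialCommentSketch {
    // Matches "/*-", then as little text as possible, then "-*/", across lines.
    private static final Pattern SPECIAL =
            Pattern.compile("/\\*-(.*?)-\\*/", Pattern.DOTALL);

    // Returns the bodies of all special comments found in the given source text.
    static List<String> extract(String source) {
        List<String> bodies = new ArrayList<>();
        Matcher m = SPECIAL.matcher(source);
        while (m.find()) {
            bodies.add(m.group(1).trim());
        }
        return bodies;
    }
}
```

Ordinary comments such as /* ... */ are not matched, so a normal Java compiler and the constraint extractor can both process the same file.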

While the scheme described so far allows many constraints to be specified in 
a convenient way, experience showed that an important class of constraints does 
not fit into this scheme. The constraint methods described so far apply to the 
definition of a class or interface and its subtypes, and are therefore called 
definition constraint methods. However, some constraints can be specified more 
naturally from the perspective of the uses of the type. Therefore, CoffeeStrainer 
supports another type of constraint method, called usage constraint methods. 

Types (classes and interfaces) can be used in two ways: either by explicitly 
naming the type, as for example in field declarations, instanceof expressions, 
and object allocations, etc., or in the form of values of the type, i.e., if an 
expression has the type as its static type. In the former case, the AST node in 
which the type name occurs is the main point of interest. In the latter case, it 
is not the expression itself, but the parent node of the expression which is the 
main point of interest. 

For example, consider the statement “return new A();”. Here, the class A 
is used in two ways: first, by explicitly naming it in the object allocation 
expression, and second, by returning a value of type A. The corresponding usage 
constraint methods are called checkUseAtObjectAllocation() and 
checkUseAtReturn(), respectively. In the following, the constraint methods that 
will be called during the traversal of the example program of figure 1 are listed 
(assuming that all these constraint methods are defined inside special comments): 

class A:  constraintClassFor(A).checkClass(A); 
          constraintClassFor(A).checkField(f); 
          constraintClassFor(B).checkUseAtField(f); 
class B:  constraintClassFor(B).checkClass(B); 
          constraintClassFor(B).checkField(g); 
          constraintClassFor(A).checkUseAtField(g); 
          constraintClassFor(A).checkUseAtFieldInitializer(g); 
          constraintClassFor(A).checkUseAtObjectAllocation(o); 

The last three calls regard different uses of A: first, as the declared type of field g; 
second, as the static type of the field initializer, and third, as the class referred 
to by name in the object allocation expression. If a type is used at a certain 
AST node in more than one way, the corresponding usage constraint methods 
are distinguished by appending additional specifiers to their names, as in 
checkUseAtFieldInitializer(). Note that usage constraint methods apply 
to every use of a certain type in a program that is checked by CoffeeStrainer: 
While checking class A, one usage constraint method of B is called because B is 
used once in class A. Similarly, three usage constraint methods defined in class A 
are called while checking class B, because class B uses A in three different ways. 



3 Definition Constraints 

In this section, we present two examples of definition constraints. For each 
constraint, first, a more detailed description and motivation is given. Second, the 
constraint is specified using CoffeeStrainer, and it is explained how this con- 
straint specification is used to check the constraint on Java programs. 



3.1 Private Access for Fields 

For proper encapsulation, a class should declare all fields (instance variables) 
with private access only. This constraint leads to programs that are more 
maintainable, since the internal representation of an object’s state can be changed 
without requiring changes to all users of that object. When fields are declared 
with private access, even subclasses of a class cannot access the fields directly, 
such that implementation changes in a base class need not lead to changes in 
derived classes. In figure 2, both the source code and the corresponding abstract 
syntax tree of an example class Student are shown. 

To the left of the source code, the AST nodes and their types are shown. A 
node of type Package is at the root of the tree. The only child of that node is 
a node of type Class, representing the class Student. This node then has two 
children of type Field, for id and name, respectively. Modifiers like private, 
protected, or public are stored as attributes of the Field nodes. In this 
diagram, there are three dotted arrows that represent information from name 
analysis. One arrow is from node Student to the node representing the class 
java.lang.Object, since Student inherits from Object. Another arrow 
is from node name to the node representing the class java.lang.String, a 
reference to the declared type of name. The third arrow is from node id to a 
predefined node representing the primitive type int. 
[Figure 2 shows the AST of the example next to its source code; only the source 
code is reproduced here.] 

package school; 

public class Student { 
    private int id; 
    String name; 
} 

Fig. 2. Student example 



Checking whether class Student only has private fields, then, amounts 
to checking whether each field in class Student is declared as private. As 
the constraint is not appropriate for all classes (for example, in performance- 
critical applications, the additional indirection might be prohibitive), we define 
an empty interface called PrivateFields and require private access for 
fields only for classes that implement PrivateFields by implementing the 
definition constraint method checkField() (set in bold face): 

public interface PrivateFields { 
    /*- 
    public boolean checkField(Field f) { 
        rationale = "all fields must be private"; 
        return f.isPrivate(); 
    } 
    -*/ 
} 

To make the constraint apply to our sample class Student, we need to 
change the class to implement PrivateFields. Since this interface is empty, 
as seen from the perspective of a normal Java compiler, the subtype relationship 
can be declared without making additional changes to Student, such as 
implementing additional methods. When using CoffeeStrainer, it is a general 
technique to use empty interfaces for defining constraints, and then to use these 
interfaces to mark classes for which the constraint should be enforced. Such 
marker interfaces are useful in other contexts as well; for example, the standard 
interface Serializable is such a marker interface that marks classes whose 
objects may be serialized and stored in a file or sent over the network. 

The steps taken by CoffeeStrainer for checking the example class Student are 
the following: 

— The AST for Student is built and then augmented with name and type 
analysis information. During name and type analysis, classes and interfaces 
that are referenced by Student need to be parsed. 

— Eventually, the interface PrivateFields will be parsed, and the special 
comment will be detected. From the code contained in that comment, a new 
constraint class meta . PrivateFields will be generated, compiled on the 
fly and loaded dynamically. 
— After all necessary files have been parsed, the actual checking will be 
performed: for both nodes of type Field, id and name, the method 
checkField() will be called by the CoffeeStrainer framework, with 
id and name as the argument, respectively. 

— The method checkField() contains a rationale for the constraint. This 
rationale is passed to CoffeeStrainer by assigning it to the predefined string 
variable rationale. This step is optional, but makes the messages that are 
output for violations of the constraint easier to understand for the 
programmer. 

— Then, if method checkField() does not return true for a node of type 
Field, which is the case for name, a field with package-local accessibility, 
a warning message will be generated. This warning message consists of the 
rationale, the name of the class which specified the constraint, and 
information about the construct that violates the constraint (file name, line 
number, and source code): 

$ java coffeestrainer.Main school.Student 
conventions.PrivateFields does not allow Field "name" 
  (because all fields must be private) 
  in file school/Student.java, line 5 

3.2 Call Overridden Method from Subclass 

Often, a superclass defines methods that can (and should) be overridden in its 
subclasses. Sometimes, overriding methods are expected to call the overridden 
method before doing anything else. For instance, consider a class MediaStream 
with a method initialize() that performs necessary initializations that 
cannot be performed in the constructor. A subclass of MediaStream that requires 
additional initialization actions should define initialize() in such a way that 
the call of super.initialize() precedes all subclass-specific initializations. 

See figure 3 for the source code and AST of the example. The class MediaStream 
only contains the method initialize(), which is represented in the 
AST by a node of type ConcreteMethod. A ConcreteMethod node contains 
a node of type Block, whereas a node of type AbstractMethod does not. 
In class MediaStream, the body of initialize() is an empty block. Class 
AudioStream contains a method initialize() as well. This method’s body 
is not empty; it contains an expression statement with an instance method call 
that calls initialize() on super. The pseudo-variables this and super 
are both represented by nodes of type This; they differ in the return value of 
isSuper(). Note that this is information that is added to the AST by name 
and type analysis. There are three other references in this example that are set 
up by name and type analysis: the node representing AudioStream references 
its superclass, the node of type InstanceMethodCall references the called 
method, and the node of type ConcreteMethod representing initialize() 
in AudioStream references the overridden method. 

The desired constraint can be written as follows: 
[Figure 3 shows the AST of the example next to its source code. The AST nodes 
recoverable from the figure are AudioStream : Class, initialize : ConcreteMethod, 
body : Block, s : ExpressionStatement, call : InstanceMethodCall, and 
super : This.] 

package streams; 

public abstract class MediaStream { 
    public void initialize() { 
        /* init code */ 
    } 
} 

public class AudioStream 
        extends MediaStream { 
    public void initialize() { 
        super.initialize(); 
        /* init audio */ 
    } 
} 

Fig. 3. Streams example 



public abstract class MediaStream { 
(01)  public void initialize() { /* initialization code */ } 
(02)  /*- 
(03)  private AMethod initializeMethod() { 
(04)    return Naming.getInstanceMethod(thisClass, 
(05)        "initialize", new AType[0]); 
(06)  } 
(07)  private boolean overrides(AMethod m1, AMethod m2) { 
(08)    if (m1 == null) return false; 
(09)    if (m1.getOverriddenMethod() == m2) return true; 
(10)    else return overrides(m1.getOverriddenMethod(), m2); 
(11)  } 
(12)  private AStatement getFirstStatement(ConcreteMethod m) { 
(13)    AStatementList sl = m.getBody().getStatements(); 
(14)    return sl.size() > 0 ? sl.get(0) : null; 
(15)  } 
(16)  private boolean callsInitialize(AStatement s) { 
(17)    if (!(s instanceof ExpressionStatement)) return false; 
(18)    AExpression e = ((ExpressionStatement) s).getExpression(); 
(19)    if (!(e instanceof InstanceMethodCall)) return false; 
(20)    AMethod called = ((InstanceMethodCall) e).getCalledMethod(); 
(21)    return called == initializeMethod() 
(22)        || overrides(called, initializeMethod()); 
(23)  } 
(24)  public boolean checkConcreteMethod(ConcreteMethod m) { 
(25)    rationale = "when overriding initialize, " + 
(26)        "super.initialize() must be the first statement"; 
(27)    return implies(overrides(m, initializeMethod()), 
(28)        callsInitialize(getFirstStatement(m))); 
(29)  } 
(30)  -*/ } 
From the special comments, CoffeeStrainer will generate a constraint class 
meta.MediaStream that contains the constraint code contained in MediaStream. 
After loading the generated class, the method checkConcreteMethod() (line 
24) will be called for every non-abstract method contained in MediaStream 
or any of its subclasses. In this method, the rationale string is set, and then 
a boolean expression is returned. This boolean expression makes use of some 
helper functions: 

— implies(b1, b2) (used in line 27) is a predefined method which returns 
true if its first argument b1 is false or its second argument b2 is true. 

— initializeMethod() (line 3) returns the AST node object representing 
the base-level method initialize(). It uses the static helper method 
Naming.getInstanceMethod(c, n, t), which returns an AST method 
object for the instance method of class c with name n and parameter types t. 
The static field thisClass is defined in every generated class and contains 
the AST class node that represents the base-level class, which in this case is 
the AST node representing MediaStream. In meta-level classes generated 
from interfaces, this field is called thisInterface. 

— overrides(m1, m2) (line 7) returns true if m1 is a method that overrides 
a method m2 defined in a superclass. m1 may be null, in which case 
overrides returns false. Otherwise, if the method which is overridden 
by m1 is identical to m2, true is returned. If this is not the case, overrides 
makes a recursive call to check whether m1 overrides m2 transitively. Note 
that the expression m1.getOverriddenMethod() may be null, in which 
case the recursive call returns false immediately. 

— getFirstStatement(m) (line 12) returns the first statement of method 
m, or null if there is no first statement. 

— callsInitialize(s) (line 16) returns true if the statement s exists, and 
if it is an expression statement which contains an instance method call that 
calls the method returned by initializeMethod() or a method overriding it. 

Note that it is possible to employ all object-oriented structuring mechanisms for 
factoring out common code, and for making complex constraints more declara- 
tive. In our case, by using a number of helper methods, the top-level constraint 
which is used in checkConcreteMethod() can be read as a declarative 
specification of the constraint which we described in the beginning of this section. 
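
The logic of the helper methods can be tried out independently of CoffeeStrainer. The following is a minimal sketch under stated assumptions: MethodSketch is a hypothetical stand-in for an AST method node (it is not the AMethod type of the listing), and the field firstCalled replaces the statement inspection done by getFirstStatement() and callsInitialize().

```java
// Hypothetical stand-in for an AST method node; not CoffeeStrainer's AMethod.
class MethodSketch {
    final String name;
    final MethodSketch overridden;   // directly overridden method, or null
    final MethodSketch firstCalled;  // method called by the first statement, or null

    MethodSketch(String name, MethodSketch overridden, MethodSketch firstCalled) {
        this.name = name;
        this.overridden = overridden;
        this.firstCalled = firstCalled;
    }
}

class OverrideCheckSketch {
    // Logical implication: false only if b1 holds and b2 does not.
    static boolean implies(boolean b1, boolean b2) {
        return !b1 || b2;
    }

    // True if m1 overrides m2 directly or transitively.
    static boolean overrides(MethodSketch m1, MethodSketch m2) {
        if (m1 == null) return false;
        if (m1.overridden == m2) return true;
        return overrides(m1.overridden, m2);
    }

    // The constraint: a method overriding init must first call init
    // or a method that itself overrides init.
    static boolean check(MethodSketch m, MethodSketch init) {
        boolean callsInit = m.firstCalled == init || overrides(m.firstCalled, init);
        return implies(overrides(m, init), callsInit);
    }
}
```

Methods that do not override init at all satisfy the constraint trivially, because the premise of the implication is false.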

4 Usage Constraints 

In this section, we present two examples of usage constraints. The description is 
structured as for the two examples of the previous section. 

4.1 Disallow Object Identity Comparisons for a Certain Type 

In object-oriented languages, objects have an identity which does not change over 
the object’s lifetime, whereas the objects’ states may change. Accordingly, one 
can distinguish between object identity (checking whether two object references 
refer to the same object using “==”) and object equality (checking whether the 
objects referred to by o1 and o2 are equal using o1.equals(o2)), the latter 
of which is usually implemented differently for each class, typically based on the 
current object’s state in comparison with the other object’s state. 

Often, if object equality for objects of some class is defined (by implementing 
equals()), object identity should not be used by clients of this class. However, 
inexperienced programmers sometimes are not aware that there is a difference 
between object identity and object equality, and use object identity even for 
objects of classes that should only be compared using object equality. One example 
of such a class is java.lang.String. It is possible that two object references 
refer to two different string objects that therefore are not identical; but these 
two objects may be equal because they contain the same character sequence. 
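
The difference can be observed directly in standard Java. In the sketch below, only the class name IdentityVsEqualitySketch and its two helper methods are made up for illustration; the behaviour of == and equals() on strings is standard Java.

```java
class IdentityVsEqualitySketch {
    // Object identity: do both references refer to the same object?
    static boolean identical(Object o1, Object o2) {
        return o1 == o2;
    }

    // Object equality: are the two objects equal according to equals()?
    static boolean equal(Object o1, Object o2) {
        return o1.equals(o2);
    }

    public static void main(String[] args) {
        // Two distinct String objects with the same character sequence.
        String s1 = new String("abc");
        String s2 = new String("abc");
        System.out.println(identical(s1, s2)); // prints false
        System.out.println(equal(s1, s2));     // prints true
    }
}
```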

Similar to PrivateFields (see section 3.1), we define an empty interface 
NoIdentity which captures the constraint that objects whose classes implement 
NoIdentity may not be compared using object identity. We do allow, 
however, comparing object references against the object reference null. 

public interface NoIdentity { 
(01)  /*-protected boolean isNull(AExpression e) { 
(02)    if (e instanceof Literal) { 
(03)      Literal l = (Literal) e; 
(04)      if (l.constantValue() == null) return true; 
(05)    } 
(06)    return false; 
(07)  } 
(08)  public boolean checkUseAtBinaryOperation( 
(09)      BinaryOperation bo) { 
(10)    rationale = "objects of this type may not " + 
(11)        "be compared using == or !="; 
(12)    return isNull(bo.getLeftOperand()) 
(13)        || isNull(bo.getRightOperand()); 
(14)  } 
(15)  -*/ 
} 

In the previous two examples, the checks were always concerned with language 
constructs that should or should not appear in a class (or interface) and all its 
descendants. This time, we are concerned with the correct use of a type. For this 
purpose, we have implemented a usage constraint method 
checkUseAtBinaryOperation(), which is called whenever an object reference 
of type NoIdentity is used in the context of a comparison (i.e., in a 
BinaryOperation where the operator is “==” or “!=”; other binary operations 
are not defined for object types). Note that this usage constraint method 
is called whenever NoIdentity or one of its subtypes is used in a reference 
comparison, and thus may apply globally if every class that is checked by 
CoffeeStrainer uses NoIdentity or one of its subtypes. 
The implementation of checkUseAtBinaryOperation() reflects the constraint 
that using an object whose class implements NoIdentity is allowed 
only if the left operand or the right operand of the comparison is the base-level 
literal null. The helper method isNull(e) (line 1) returns true if the 
expression e is the literal null. 



4.2 Disallow null Value for a Certain Type 

Because object types in Java are reference types (as opposed to value types like 
int, float, etc.), the value null is a valid value for all fields, variables, pa- 
rameters, and method results of reference types. Sometimes, for example when 
defining enumeration classes, the value null should not be used for fields, vari- 
ables, etc. of that enumeration type. In an empty interface NonNull with which 
enumeration classes can be marked, we can define a constraint that specifies that 
only non-null values should be used for initializing or assigning to fields and vari- 
ables, for binding to parameters, and for returning from methods. Additionally, 
by deriving NonNull from NoIdentity of the previous example, we inherit 
the constraint that values of this type can only be compared using equals(). 

public interface NonNull extends NoIdentity { 
(01)  /*-protected boolean exists(AExpression e) { 
(02)    return e != null; 
(03)  } 
(04)  protected boolean isNull(AExpression e) { 
(05)    return NoIdentity.isNull(e); 
(06)  } 
(07)  protected boolean isNonNull(AExpression e) { 
(08)    return exists(e) && !isNull(e); 
(09)  } 
(10)  public boolean checkUseAtField(Field f) { 
(11)    rationale = "field needs non-null initializer"; 
(12)    return isNonNull(f.getInitializer()); 
(13)  } 
(14)  public boolean checkUseAtLocalVariable(LocalVariable v) { 
(15)    rationale = "variable needs non-null initializer"; 
(16)    return isNonNull(v.getInitializer()); 
(17)  } 
(18)  public boolean checkUseAtReturn(Return r) { 
(19)    rationale = "method may not return null"; 
(20)    return isNonNull(r.getExpression()); 
(21)  } 
(22)  public boolean checkUseAtAssignment(Assignment a) { 
(23)    rationale = "assignment may not assign null"; 
(24)    return !isNull(a.getOperand()); 
(25)  } 
(26)  public boolean checkUseAtMethodCallParameter( 
(27)      int index, AMethodCall mc) { 
(28)    rationale = "method call argument may not be null"; 
(29)    return !isNull(mc.getArguments().get(index)); 
(30)  } 
(31)  public boolean checkUseAtCast(Cast c) { 
(32)    rationale = "downcast is not allowed"; 
(33)    return false; 
(34)  } 
(35)  public boolean checkUseAtConditionalIfTrue( 
(36)      Conditional c) { 
(37)    rationale = "hiding null in conditional expression" + 
(38)        " is not allowed"; 
(39)    return !isNull(c.getIfTrue()); 
(40)  } 
(41)  public boolean checkUseAtConditionalIfFalse( 
(42)      Conditional c) { 
(43)    rationale = "hiding null in conditional expression" + 
(44)        " is not allowed"; 
(45)    return !isNull(c.getIfFalse()); 
(46)  } 
(47)  -*/ 
} 

Note that although the example might seem long, it is complete and deals with 
Java and not a toy language. We have defined three helper methods: exists() 
(line 1) returns true if the argument expression exists (i.e., is not null). The 
method isNull() (line 4) reuses isNull(), which has been defined for the 
previous example. Note that this method is not inherited, because base-level 
inheritance is not reflected by inheritance at the meta-level: both base-level 
interfaces and classes have corresponding meta-level classes, and thus multiple 
inheritance would be required. The method isNonNull() (line 7) returns true if 
its argument expression e exists and is not the base-level literal null. 

The remaining usage constraint methods check that such expressions are 
never used for initializing fields or variables, for returning from a method, as 
the right hand side of an assignment, or passed as argument of a method call, 
respectively. As a result, code that deals with classes that implement NonNull 
never has to check for null values! 
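
The interplay of the three helper methods can be sketched in isolation. This is only an illustration under stated assumptions: ExprSketch is a hypothetical stand-in for an AST expression node with a single flag, and a Java null reference models an absent expression (for example, a field declared without an initializer).

```java
// Hypothetical stand-in for an AST expression node.
class ExprSketch {
    final boolean nullLiteral; // true if the expression is the literal null
    ExprSketch(boolean nullLiteral) { this.nullLiteral = nullLiteral; }
}

class NonNullSketch {
    // The expression is present at all (e.g. the field has an initializer).
    static boolean exists(ExprSketch e) {
        return e != null;
    }

    // The expression is the base-level literal null.
    static boolean isNull(ExprSketch e) {
        return e != null && e.nullLiteral;
    }

    // Required for initializers and return values: present and not literal null.
    static boolean isNonNull(ExprSketch e) {
        return exists(e) && !isNull(e);
    }
}
```

With these definitions, a missing initializer fails isNonNull() just as an explicit null initializer does, whereas checks written as !isNull(...) reject only the literal null.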

5 Related Work 

CoffeeStrainer is a framework for specifying constraints about object-oriented 
programs that are checked at compile-time. Five systems that are very similar 
to CoffeeStrainer will be discussed according to the key points of the discussion 
from section 2: completeness, conciseness, modularity, and efficiency. 

5.1 Other Systems 

There are five systems which are similar to CoffeeStrainer. 
— GENOA [4], a customizable code analyzer which can be interfaced to existing 
language front-ends, provides a LISP-like query language that applies to the 
complete AST of a program under examination. 

— The C++ Constraint Expression Language - CCEL - [2] allows statically 
checked constraints to be specified on programs written in C++. Constraints are 
specified in a special-purpose language and can be checked globally, for a 
single file, for a class, or for a function. 

— Law-Governed Architecture - LGA - [9] supports constraints specified on 
a language-independent object model, using Prolog as the constraint lan- 
guage. A mapping of the abstract object model to Eiffel has been defined 
and implemented [10]. LGA can specify more than just statically checkable 
constraints on object-oriented programs. Its scope extends on one side to 
constraints that can be defined on the software development process, and on 
the other side to constraints that can only be checked at runtime. In this 
comparison, we will consider the statically checkable subset of LGA only. 

— The Category Description Language - CDL - which has been proposed for 
specifying Formal Design Constraints [8], is a constraint language based on 
a theory of logics on parse trees. CDL is a restricted formalism that makes it 
possible to check automatically whether a set of constraints is consistent. 

— ASTLOG [3], a language for examining abstract syntax trees, is a variant 
of Prolog in which the clauses have access to a complete AST of C++ 
programs. In ASTLOG, constraints are written in an inside-out functional 
style that is particularly suited for analyzing tree structures. 

While CoffeeStrainer, GENOA, CDL, and ASTLOG operate on complete ab- 
stract syntax trees, i.e. including method bodies with statements and expres- 
sions, both CCEL and LGA provide only partial representations of the parsed 
program. In CCEL, constraints have access only to top-level declarations, namely 
to class declarations, signatures of methods, and field declarations. Constraints 
in LGA are restricted to operations defined on objects in the abstract object 
model - object creation and deletion, reading and writing object properties, and 
invoking methods on objects. Thus, constraints that refer to the control flow, 
to declarations of parameters and variables, and constraints that take language- 
specific features into account are not possible. Additionally, it should be noted 
that CDL provides only a simple parse tree without name and type analysis 
information. 

Regarding conciseness, GENOA, CCEL, LGA, and ASTLOG are better off 
because they use declarative languages for specifying constraints. To assess CDL 
in this respect is not easy, because it is a restricted language in which consistency 
of a set of constraints is statically decidable. As the authors of CDL note, some 
constraints cannot be expressed in CDL, and an extension or loophole would 
be needed - destroying the decidability property. For CoffeeStrainer, much care 
has been taken to make constraints as declarative as possible, but the fact that 
constraint methods are written in Java makes them less declarative and concise 
than they could be. However, CoffeeStrainer has the advantage that the programmer 
need not learn a new language or syntax for specifying constraints - as in 
the other systems - which makes it much more accessible for practitioners. 
For example, the language used for specifying constraints in CCEL, although 
very similar to the base-level language C++, needs more than three pages of 
grammar description [5]. Moreover, constraints defined in ASTLOG are very 
difficult to read because of their higher-order and inside-out style. Because of 
its customizability, GENOA’s queries are not as concise as those of the other 
declarative systems: they seem to be very similar to what one would write in 
CoffeeStrainer. 

CoffeeStrainer allows the specification of constraints that are modular, customizable 
and composable. Constraints can be extended and refined in subtypes, and well- 
known object-oriented structuring techniques can be used for constraints. By 
using interface inheritance, marker interfaces can be defined that combine several 
constraints. Because multiple inheritance is allowed for interfaces, this idiom 
is quite flexible. In CoffeeStrainer, constraints are associated with classes and 
interfaces, so that different compilation units can be checked separately. Most 
of these possibilities are not present at all, or are limited, in the other systems. 
GENOA, CCEL, LGA, and ASTLOG check all constraints globally, and a list 
of all constraints has to be searched when the programmer wants to find out 
what constraints apply to a given class. In GENOA and CCEL, there are no 
provisions for composing or extending constraints. LGA and ASTLOG support 
composition of constraints, because they use Prolog as the constraint language. 
Also, common constraint code can be factored out using intermediate Prolog 
rules, but in contrast to CoffeeStrainer this issue has to be considered in advance. 
In CDL, it is as easy as in CoffeeStrainer to find constraints that apply to a type, 
because language constructs are annotated with constraint names. However, this 
requires the base language’s syntax to be extended, which is not an option in 
most software development projects. 

Usage constraints cannot be specified in CDL at all, because constraints are 
only applied to annotated constructs. CCEL and LGA deal only with restricted 
sets of usage constraints: CCEL only supports usage constraints based on ex- 
plicitly naming other types in class definitions, field declarations, and method 
signatures; and LGA only supports usage constraints based on methods that 
are invoked on objects, and constraints based on object allocations. Although 
the other systems do not support usage constraints, they can be written as a 
definition constraint that applies globally. Note that both example usage con- 
straints in section 4 could not be specified with any of the other systems in an 
appropriate way. 

Regarding efficiency, it seems that CDL is best, because it is based on a 
restricted formalism, followed by CCEL, GENOA, and CoffeeStrainer, which 
traverse the AST of the program to be checked exactly once and cannot apply 
the optimizations that are possible in CDL. LGA and ASTLOG, which are based 
on Prolog, support backtracking and unification and thus trade efficiency 
for expressiveness. 

Table 1 compares the systems that have been discussed. A “complete AST” 
is given if the AST covers all language constructs and includes semantic 
information from name and type analysis. On one side of the expressiveness scale, 
ASTLOG scores very high because it has higher-order features. In CDL, on the 
other side, certain constraints cannot be specified at all, but consistency of a set 
of constraints can be checked automatically. CoffeeStrainer is the only system 
that allows modular checking and modular composition of constraints, and no 
other system has support for usage constraints comparable to CoffeeStrainer. 
The potential for optimizations is highest for CDL with its restricted formalism 
and lowest for ASTLOG with its higher-order capabilities. 



System          language          complete AST  expressiveness  modular  usage constraints   efficiency
GENOA           generic / C++     yes           medium          no       no support          high
CCEL            C++               no            medium          no       no support          high
LGA             generic / Eiffel  no            high            no       restricted support  medium
CDL             generic           no            low             no       not possible        very high
ASTLOG          C++               yes           very high       no       no support          medium
CoffeeStrainer  Java              yes           medium          yes      yes                 high



Table 1. Comparison of systems for static constraint checking 



6 Conclusions 

We have presented a system that allows structural constraints on Java programs
to be checked statically. Unlike previous work, CoffeeStrainer takes a pragmatic
approach and does not define a new special-purpose language. Instead,
constraints are specified using stylized Java, so that the programmer need not 
learn new syntax. The system is implemented as an open object-oriented frame- 
work for compile-time meta programming that can be extended and customized; 
constraints can be composed and refined. Constraint code and base-level code 
share the same structure by embedding constraint code in special comments, 
making it easy to find the rules that apply to a given part of the program, and 
allowing arbitrary compilers and tools to be applied to the source code that 
contains constraints. When defining a new rule, the programmer has access to a 
complete abstract syntax tree of the program that is to be checked; unlike other 
proposals, the meta model is not restricted to classes, methods and method 
calls, and special support is provided for constraining the usage of classes and 
interfaces. The framework has been fully implemented, and it is available at 
http://www.inf.fu-berlin.de/~bokowski/CoffeeStrainer.

An area of further work is the issue of encoding properties of language con- 
structs. In this paper, we have used empty interfaces as “markers” that allow 
constraints to be specified applying only to marked classes and interfaces. We 
foresee that for some constraints, similar markers will be needed for methods,
fields, variables, parameters, etc. Although sometimes naming conventions might
help (e.g., fields whose names begin with “shared-” should be accessed in
synchronized methods only), in general, a new kind of special comments will be
needed that can be used to annotate arbitrary language constructs.

Appendix: Overview of the AST Classes

The classes and methods of the AST 
built by CoffeeStrainer are listed in a 
hierarchical way that reflects the in- 
heritance relationships between classes. 
See [1] for a complete description of
these classes and methods. 

Package
    .getClasses()
    .getInterfaces()

Field
    .name(), qualifiedName()
    .getType()
    .getInitializer()

AMethod
    .name(), qualifiedName()
    .getResultType()
    .getParameters()
    .getOverridenMethod()
AbstractMethod
ConcreteMethod
    .getBody()
Constructor
    .getConstructorCall()

AType
    .isAssignableTo(AType)
    .isPassableTo(AType)
    .isCastableTo(AType)
PrimitiveType
    .isBoolean(), isByte(),
    .isChar(), isDouble(),
    .isFloat(), isInt(),
    .isLong(), isShort()
AReferenceType
    .isSubtypeOf(AReferenceType)
    .getInstanceMethod(AType[])
    .getStaticMethod(AType[])
Array
    .getElementType()
NullType
AUserType
    .name(), qualifiedName()
    .getFields()
    .getAbstractMethods()
    .getStaticInitializers()
    .getNestedClasses()
    .getNestedInterfaces()
Interface
    .getExtendedInterfaces()
    .isSubinterfaceOf(Interface)
Class
    .getImplementedInterfaces()
    .getSuperclass()
    .getConcreteMethods()
    .getConstructors()
    .getInstanceInitializers()
    .isSubclassOf(Class)
    .isImplementationOf(Interface)

ConstructorCall
    .getCalledConstructor()

AVariable
    .name(), qualifiedName()
    .getType()
Parameter
LocalVariable
    .getInitializer()

AExpression
    .getType()
Literal
    .constantValue()
This
    .isSuper()
BinaryOperation
    .getLeftOperand()
    .getRightOperand()
Conditional
    .getCondition()
    .getIfTrue()
    .getIfFalse()
ALValue
VariableAccess
    .getVariable()
ArrayAccess
    .getArray()
    .getExpression()
AFieldAccess
    .getField()
StaticFieldAccess
InstanceFieldAccess
    .getInstance()
AOperandExpression
    .getOperand()
UnaryOperation
    .operator(), isPostfix()
Cast
    .getCastType()
Instanceof
    .getReferenceType()
ParenExpression
Assignment
    .getLValue()
AArgumentsExpression
    .getArguments()
AMethodCall
    .getCalledMethod()
StaticMethodCall
InstanceMethodCall
    .getInstance()
ObjectAllocation
    .getCalledConstructor()
AnonymousAllocation
ArrayAllocation
    .freeDimensions()
    .getInitializer()
ArrayInitializer

AStatement
EmptyStatement
VariableDeclaration
    .getVariable()
Block
    .getStatements()
AStatementWithExpression
    .getExpression()
ExpressionStatement
Return
Throw
Synchronized
    .getBlock()
If
    .getThenBranch()
    .getElseBranch()
Switch
    .getBranches()
ALoopingStatement
    .getBody()
Do
While
For
    .getForInit()
    .getUpdateExpression()
Continue
    .getTarget()
Break
    .getTarget()
Try
    .getBlock()
    .getCatchClauses()
    .getFinallyClause()
UserTypeDeclaration
    .getUserType()

AForInit
ForInitDeclaration
    .getDeclarations()
ForInitExpression
    .getExpressions()

ASwitchBranch
    .getStatements()
CaseBranch
    .getConstantExpression()
DefaultBranch

Catch
    .getParameter()
    .getBlock()

Finally
    .getBlock()

Abstract. CIP is a model-based software development method for embedded
systems. The problem of constructing an embedded system is decomposed into 
a functional and a connection problem. The functional problem is solved by 
constructing a formal reactive behavioural model. A CIP model consists of 
concurrent clusters of synchronously cooperating extended state machines. The 
state machines of a cluster interact by multi-cast events. State machines of 
different clusters can communicate through asynchronous channels. The 
construction of CIP models is supported by the CIP Tool, a graphical modelling 
framework with code generators that transform CIP models into concurrently 
executable CIP components. The connection problem consists of connecting 
generated CIP components to the real environment. This problem is solved by 
means of techniques and tools adapted to the technology of the interface 
devices. Construction of a CIP model starts from the behaviour of the processes 
of the real environment, leading to an operational specification of the system 
behaviour in constructive steps. This approach allows stable interfaces of CIP 
components to be specified at an early stage, thus supporting concurrent 
development of their connection to the environment. 



1. Introduction 

The CIP method (Communicating Interacting Processes) presented in this paper is a 
formal software development method for embedded systems. By ‘embedded system’ 
we mean any computer system used to control a technical environment. Examples 
include highly automated devices, industrial robots and computer controlled 
production processes. 

CIP specifications are constructed with the CIP Tool¹ [1], a modelling framework
with verification functions and code generators that transform CIP models into
executable software components. The method and its tool have been used in many real
projects. The benefit of a rigorous problem-oriented approach is an important
improvement of software quality, reflected by understandable system models, robust
and reliable software products and a considerable reduction of maintenance costs.

¹ CIP Tool® is a registered trademark. All graphic figures of model parts presented in this
paper have been generated from models created with CIP Tool®. Associated textual
descriptions such as condition definitions or condition allocations correspond to model
reports produced automatically.

O. Nierstrasz, M. Lemoine (Eds.): ESEC/FSE '99, LNCS 1687, pp. 375-392, 1999.
© Springer-Verlag Berlin Heidelberg 1999

The starting point in the design of CIP was the JSD method (Jackson System 
Development) [2], adopting the real world oriented modelling paradigm of this 
approach. JSD treats dynamic information problems by means of concurrent 
sequential processes, simulating a part of the real world and producing requested 
information about it. CIP differs from JSD mainly in its modelling framework, which 
is based on synchronously cooperating extended state machines rather than on 
concurrent processes described by extended regular expressions (structograms). 

The CIP method is based on the following development concepts: 

Problem Decomposition. The usual purpose of behavioural models in embedded 
system development is to specify the functional system behaviour in subject-matter 
terms. Such models, independent of technical interface concerns, are often called 
essential models [3]. The connection of an implemented essential model to the real 
environment represents a problem in its own right, demanding tools and techniques 
adapted to the technology of the interface devices. 

Model-Based Operational Specification. The functional behaviour of a CIP system is 
specified by an operational model of cooperating extended state machines. 
‘Operational’ means that the model is formally executable [4]. CIP combines 
synchronous and asynchronous cooperation of system parts within the same model. 
Synchronous cooperation, well known from real-time description techniques like 
Statecharts [5], ESTEREL [6] or LUSTRE [7], is needed to model synchronous 
propagation of internal interactions. Asynchronous cooperation on the other hand, 
supported by parallel modelling languages like SDL [8], JSD [2] or ROOM [9], is 
necessary to express concurrency. 

Component-Based Construction by White Box Composition. CIP models are
developed with the CIP Tool [1], a framework of graphic and text editors supporting full
coherence among various architectural and behavioural views. Models are constructed
by creating, composing and linking model components such as processes, channels,
messages, states and operations. Modelling by compositional construction perfectly
supports the problem-oriented construction process of the method, providing more
flexibility and intuition than language-based specification techniques.

Component-Based Implementation by Black Box Composition. CIP models are 
transformed automatically into executable software components which are integrated 
on the one hand with components connecting to the environment, on the other hand with
system parts like technical data processing units or extensive algorithmic functions. 
The main goal of software component technology is usually to construct systems by 
means of reusable building blocks. Although reuse of embedded components is often 
not possible because of the specific behaviour of the particular environment, 
component composition is still of great value. Concurrent development and flexible 
system integration is easier when system parts are constructed as software components 
with stable interfaces. 

Environment-Oriented Development. The development of a functional behavioural 
model must start with a rigorous definition of the model boundary. The widely used
context schema of SDRTS [3], for example, models the boundary by means of event
and data flows to and from the system. Although such boundary models allow the 
development of the behavioural model to be based on a well-defined set of external 
interaction points, they fail to express any behavioural relationships with the 
environment. CIP starts the development by defining the set of valid interaction 
sequences by means of a behavioural context model. This approach supports the 
construction of robust and dependable systems because it incorporates a formal model 
of the environment behaviour. 

The main part of the paper starts in section 2 by describing how CIP tackles the 
embedded system problem. Section 3 explains the architectural and behavioural 
constructs used to build CIP models. Section 4 presents the environment-oriented 
development process of the CIP method, illustrated by a simple but complete example 
of a CIP model construction. Section 5 finally describes how generated software 
components are connected to the environment. 

2. CIP Application Area: Control Problems 

An embedded system is a computer system which senses and controls a number of 
external processes. The behaviour of the individual processes is partly autonomous 
and partly reactive. In the case of physical processes the behaviour can be deduced 
from physical properties. More complex processes are often already controlled by 
local microprocessors. An operator driven man-machine interface is another source of 
asynchronous influences on the system. 

Two main problems are encountered when an embedded system is developed. The 
first problem concerns the functionality of the system: to bring about the required 
behaviour the embedded system to be constructed must react in a specific way to 
subject-matter phenomena of the environment. The second problem concerns the 
connection of environment and embedded system: the subject-matter phenomena 
related in the functional problem solution must be detected and produced by 
monitoring and controlling specific interface devices like sensors and actuators. 

As a simple example we take an opening door where the door motor has to be
turned off when the door is fully open. The function of the system is simply to
produce the MotorOff action when the Opened event occurs. However, neither the
event Opened nor the action MotorOff is directly shared with the embedded system.
Instead the event Opened must be detected by means of a position sensor which is
connected to the embedded system by a shared binary variable; and the action
MotorOff must be produced by setting a binary variable of the motor actuator
appropriately.

The CIP method is based on a complete separation of the functional and the 
connection problem. The functional problem is solved independently of the interface 
devices by specifying a rigorous behavioural model. The CIP model is constructed 
graphically with the CIP Tool and transformed automatically into concurrently 
executable CIP units. 

The construction of a CIP model is based on a virtual connection to the external
processes. The virtual interface of the external processes consists of collections of
events and actions designating instantaneous subject-matter phenomena occurring in
the environment. Events are phenomena initiated by the environment. Process events,
often called discrete events, are caused by the autonomous dynamics of external
processes. Continuous behaviour of external processes is captured by periodic
temporal events with associated state values (sampling). Actions are environment
phenomena caused by reactions of the embedded system. The virtual connection
transmits a corresponding message whenever an event has occurred or an action must
be produced.

378 H. Fierz



[Figure labels: physical interaction; external processes; sensors; actuators; events; connection; actions; embedded units]

Fig. 1. Conceptual embedded system architecture

The working connection between external processes and generated CIP units 
consists of sensors and actuators connected to system modules called embedded 
connectors. An embedded connector detects events by monitoring sensor phenomena, 
and triggers its CIP unit with corresponding event messages. It receives action 
messages from the CIP unit, and initiates appropriate actuator phenomena to produce 
the corresponding actions in the environment. 

In the door example, the reaction "Opened causes MotorOff" would be produced by 
a CIP component while the monitoring and setting of the interface variables is done by 
a separately constructed embedded connector. 
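
The division of labour in the door example can be sketched as follows. This is only an illustration under stated assumptions: the class names, the dictionary standing in for the shared interface variables, the variable names `door_open_sensor` and `motor_on`, and the polling scheme are all hypothetical and are not taken from code generated by the CIP Tool.

```python
class DoorUnit:
    """Functional part: knows only events and actions, not devices."""

    def __init__(self, send_action):
        self.send_action = send_action  # callback supplied by the connector

    def handle_event(self, event):
        # The reaction "Opened causes MotorOff".
        if event == "Opened":
            self.send_action("MotorOff")


class DoorConnector:
    """Connection part: maps interface variables to events and actions."""

    def __init__(self, io):
        self.io = io  # dict standing in for the shared binary variables
        self.unit = DoorUnit(self.produce_action)
        self.last_sensor = io["door_open_sensor"]

    def poll(self):
        # An Opened event is detected as a rising edge of the sensor bit.
        sensor = self.io["door_open_sensor"]
        if sensor and not self.last_sensor:
            self.unit.handle_event("Opened")
        self.last_sensor = sensor

    def produce_action(self, action):
        # The MotorOff action is produced by clearing the actuator bit.
        if action == "MotorOff":
            self.io["motor_on"] = False


io = {"door_open_sensor": False, "motor_on": True}
connector = DoorConnector(io)
connector.poll()                # door still moving: no event
io["door_open_sensor"] = True   # the environment sets the sensor bit
connector.poll()                # edge detected: Opened, then MotorOff
```

Only `DoorConnector` would have to change if the sensor were read over a bus instead of a shared variable; `DoorUnit` is untouched, which is the point of the separation.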

The complete separation of the functional and the connection problem considerably 
benefits the development process since the two problems can be solved independently. 
This is not just a question of reducing a large problem to two smaller ones, but of 
disentangling two problem complexes belonging to different abstraction levels. 

An important benefit of this problem separation appears for example when control 
functions have to be validated. Because the developed software can rarely be tested 
directly on the target system, it is necessary to use simulation models and specific test 
beds. The CIP approach allows a functional solution to be partitioned in various ways 
and corresponding software components to be generated that can be easily embedded 
within various test environments. 

The proposed problem decomposition has been elaborated by means of the notion 
of problem frames recently introduced by Jackson [10]. The approach allowed us to 
understand more deeply the generic structure of the embedded system problem and its 
relation to the development process. Explicit discussion of the use of problem frames 
for embedded systems is beyond the scope of this paper and will be presented in
another publication.

3. CIP Models

The CIP meta-model has been defined on the basis of a compositional mathematical 
formalism [11] developed for this purpose. The modelling tool has been designed as a 
component framework based on an object model implementing the CIP meta-model. 
Current research and development extends the modelling framework by tools allowing 
CIP models to be checked against independently defined behavioural properties. 

CIP models are constructed graphically by means of architectural composition, well 
known as a basic paradigm of architecture description languages [12]: communication 
and interaction among state machines is specified by interconnecting these 
components by first-class connectors. A drawback of many synchronous real-time 
languages is that interconnections are described only implicitly, thus proliferating 
complex interaction dependencies. In addition to the architectural configuration, CIP 
requires the control flow among synchronous components to be explicitly modelled to 
prevent cyclic and conflicting chain reactions. Furthermore, behavioural modelling is 
supported by a novel hierarchical structure called master-slave hierarchy. 

Clusters and Processes. A CIP model is composed of a set of asynchronously 
cooperating clusters, each consisting of a number of synchronously cooperating state 
machines termed processes. Formally a cluster represents a state machine with a 
multi-dimensional state space: the state of a cluster is defined by the tuple of its 
process states. Although clusters as well as processes represent parallel behavioural 
entities, their composition semantics is essentially different: clusters model concurrent 
functional blocks of a system, while processes represent orthogonal components of a 
cluster. Hierarchical composition structures based on a simple “part of” hierarchy
relation can of course be introduced at both levels. 

Communication - Asynchronous Transmission of Messages. Processes of different 
clusters communicate asynchronously with each other and with the environment by 
means of channels. Communication is specified by a graphical net model (fig. 2) in 
which channels are attached to process ports. Source and sink channels model the 
virtual connection to the environment while internal channels are part of the CIP 
model. 



[Figure labels: FlowControl; aCluster; aSink; aSource; VesselActions; PumpActions]

Fig. 2. Communication net of a CIP model

Channels model an active communication medium which retains the sequential order
of transmitted messages. Asynchronous communication in a CIP model means that the
write and the read action of a message transmission take place in different cluster
transitions. Processes represent receptive behavioural entities which must accept
delivered messages at any time.

Interaction - Synchronous Pulse Transmission. The processes of a cluster interact
synchronously by means of multi-cast pulses. Pulses represent internally transmitted 
events. The straight directed connectors of the interaction net (fig. 3) define the pulse 
flow structure of a cluster. Every connector has an associated partial function termed 
pulse translation which relates outpulses of the sender to inpulses of the receiver 
process. Rhombic connectors declare state inspection (see below). 

A cluster is always activated by a channel message which leads to a state transition 
of the receiving process. By emitting a pulse, the receiving process can activate 
further processes of the cluster, which can in turn activate other processes by pulses. 
The chain reaction resulting from pulse transmission is not interruptible and defines a 
single state transition of the entire cluster. Activated processes can also write 
messages to their output channels. 
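
A single cluster transition, as described above, might be sketched like this. The sketch assumes a very simple representation: each process is a table from (state, input) to (next state, outpulse, output message), pulse translations are dictionaries (partial functions), and the output channel is a FIFO queue. The process names `Controller` and `Door`, the message `Start` and the pulse `go` are hypothetical. The loop terminates only for acyclic pulse paths; ruling out cycles is the job of the cascades introduced below.

```python
from collections import deque


class Process:
    def __init__(self, name, reactions):
        self.name = name
        self.state = "idle"
        # (state, input) -> (next_state, outpulse or None, message or None)
        self.reactions = reactions

    def react(self, inp):
        key = (self.state, inp)
        if key not in self.reactions:
            return None, None  # no transition for this input: nothing happens
        self.state, outpulse, message = self.reactions[key]
        return outpulse, message


class Cluster:
    def __init__(self, processes, translations):
        self.processes = processes  # name -> Process
        # (sender, receiver) -> {outpulse: inpulse}; a partial function
        self.translations = translations
        self.output = deque()       # stands in for an output channel

    def step(self, receiver, message):
        """One uninterruptible cluster transition triggered by a channel message."""
        pending = deque([(receiver, message)])
        while pending:
            name, inp = pending.popleft()
            outpulse, out_msg = self.processes[name].react(inp)
            if out_msg is not None:
                self.output.append(out_msg)
            if outpulse is None:
                continue
            # Multi-cast: every connector leaving the sender may translate the pulse.
            for (sender, target), table in self.translations.items():
                if sender == name and outpulse in table:
                    pending.append((target, table[outpulse]))


controller = Process("Controller", {("idle", "Start"): ("running", "go", None)})
door = Process("Door", {("idle", "open"): ("opening", None, "MotOpen")})
cluster = Cluster(
    {"Controller": controller, "Door": door},
    {("Controller", "Door"): {"go": "open"}},
)

inbox = deque(["Start"])                      # input channel of Controller
cluster.step("Controller", inbox.popleft())   # read happens in its own transition
```

After the step both processes have changed state and `MotOpen` is waiting in the output queue; from the outside, the whole chain reaction is one state transition of the cluster.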




Fig. 4. Cascades of a cluster 

The structure of the interaction net does not sufficiently restrict the potential pulse 
transmission chains, as conflicting process activations and cyclic transmission paths 
are in general possible. The problem is well known from statecharts models. To 
ensure deterministic and bounded propagation of interaction, for each process with 
channel input the control flow of chain reactions is restricted by means of a cascade 
(fig. 4). A cascade is a sequential process activation tree compatible with the inte- 
raction net structure. The pulse interaction defined in the processes can be checked 
automatically against the specified cascades. A cascade is activated by a channel input 
for its topmost process. The execution order of a cascade is defined as tree traversal 
from left to right. Cascade branches can be refined into exclusive branches. 
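
As an illustration, a cascade can be represented as a nested tree and checked against the interaction net. The sketch assumes that "tree traversal from left to right" means a depth-first, preorder traversal, and that compatibility means every parent-child edge of the cascade is a connector of the net; the process names `A` to `D` are hypothetical, and exclusive branches are not modelled.

```python
def execution_order(cascade):
    """Preorder, left-to-right traversal of a cascade tree.

    A cascade node is a pair (process_name, [child cascades]).
    """
    name, children = cascade
    order = [name]
    for child in children:
        order.extend(execution_order(child))
    return order


def is_compatible(cascade, interaction_net):
    """Check that every parent -> child edge is a connector of the interaction net."""
    name, children = cascade
    return all(
        (name, child[0]) in interaction_net and is_compatible(child, interaction_net)
        for child in children
    )


# Hypothetical cluster: A receives the channel input and may activate B and D;
# B may in turn activate C.
net = {("A", "B"), ("B", "C"), ("A", "D")}
cascade = ("A", [("B", [("C", [])]), ("D", [])])

print(execution_order(cascade))
print(is_compatible(cascade, net))
```

Because a cascade is a finite tree, the activation order is fixed and bounded regardless of which cycles the interaction net itself may contain.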

Cascades define functional compositions of deterministic state machines, thus all 
cluster next-state relations are functions. A similar functional approach is known from 
the statecharts variant RSML [13], where global next-state relations are analysed auto- 
matically by investigating functional compositions of atomic next-state relations. 

State Inspection - Static Context Dependency within a Cluster. The conditions of 
a state transition structure of a process are allowed to depend on the states and 
variables of other processes of the same cluster. Read access to the data of a process is
called state inspection and takes place, as in object-oriented models, via access
functions termed inquiries. In contrast to pulse transmission, where both transmitter
and receiver are activated, in the case of state inspection only the inspecting process
is active. State inspection gives rise to additional dependencies between processes
which are declared graphically as rhombic connectors in the interaction net (fig. 3).
The arrows denote the data flow direction.

Processes - Extended Finite State Machines. Processes are modelled as extended
finite state machines. By means of state transition structures and operations executed 
within transitions, functionality can be specified on two different levels of abstraction. 

The communication interface of a process is defined by one or more inports and 
outports. A port is specified by the set of messages to be received or sent. Each inport 
and outport is connected in the communication net to an incoming or outgoing channel 
respectively. Interaction inputs and outputs on the other hand are defined by two 
distinct sets of inpulses and outpulses. 

PROCESS Door
INPORT DoorEvents
    MESSAGES Closed, Opened
OUTPORT DoorActions
    MESSAGES MotClose, MotOpen, MotOff
INPULSES close, open
OUTPULSES clsAck, opnAck




Fig. 5. Pure finite state machine 

The transition structure depicted in figure 5 specifies the behaviour of the process 
Door. Process states are represented by circles, transitions by labelled transition 
boxes. The input messages Opened and Closed or the inpulses open and close of the 
process can activate a state transition. An input message for which the current state 
has no outgoing transition causes a context error. An inpulse for which there is no 
transition, on the other hand, is ignored. In each transition an outpulse and a message 
to each outport can be emitted. 
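
The different treatment of messages and inpulses can be sketched as below. Only the ports, messages and pulses come from the process definition; the four states and the transition table are assumptions made for illustration, as is the `ContextError` exception.

```python
class ContextError(Exception):
    """Raised when a channel message arrives in a state with no transition for it."""


class Door:
    MESSAGES = {"Closed", "Opened"}
    INPULSES = {"close", "open"}

    # (state, input) -> (next state, outpulse, message on DoorActions)
    # The table itself is an assumed example, not the one of the figure.
    TRANSITIONS = {
        ("closed", "open"): ("opening", None, "MotOpen"),
        ("opening", "Opened"): ("opened", "opnAck", "MotOff"),
        ("opened", "close"): ("closing", None, "MotClose"),
        ("closing", "Closed"): ("closed", "clsAck", "MotOff"),
    }

    def __init__(self):
        self.state = "closed"

    def receive(self, inp):
        key = (self.state, inp)
        if key in self.TRANSITIONS:
            self.state, outpulse, message = self.TRANSITIONS[key]
            return outpulse, message
        if inp in self.MESSAGES:
            # A message without an outgoing transition is a context error.
            raise ContextError(f"{inp} not expected in state {self.state}")
        return None, None  # an inpulse without a transition is ignored


door = Door()
door.receive("close")            # ignored: the door is already closed
print(door.receive("open"))      # starts the motor
print(door.receive("Opened"))    # acknowledges and stops the motor
```

The asymmetry reflects where the inputs come from: a message reports something that happened in the environment and cannot be wished away, whereas a pulse is an internal request that a process may decline.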

To support data processing and algorithmic concerns CIP processes are specified as 
extended state machines. The extension consists of static process variables, data types 
for messages and pulses, operations and conditions. 

PROCESS Cashier
VARIABLES amount: int, price: int
INPORT EventPort
    MESSAGES Coin: int, Abort
OUTPORT ActionPort
    MESSAGES Eject: int, OpenSlot
INPULSES getMoney: int
OUTPULSES aborted, paid: int
OPERATIONS
    init {SELF.amount = 0; SELF.price = IN;}
    incr {SELF.amount = SELF.amount + IN;}
CONDITIONS
    notEnough (SELF.amount < SELF.price)

CONDITION ALLOCATION
    2 notEnough, 3 ELSE_
OPERATION ALLOCATION
    1 init, 2 incr, 3 change, report, 4 ...

Fig. 6. Extended finite state machine

The Coin message of the process Cashier (fig. 6) for instance carries an integer value
representing the value of the inserted coin. Operations allocated to transitions are used 
to update the inserted amount and to calculate the change. When the process is in the 
cashing state - shown in grey - there are two potential transitions for the input message 
Coin. Associated conditions render the process behaviour deterministic. Such 
conditions can depend on the input data and the values of the local process variables, 
but also, by state inspection, on the states and variables of other processes of the same 
cluster. 
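
How conditions make the choice between several transitions for the same input deterministic can be sketched as follows. The function `select_transition` and the dictionary of process variables are hypothetical; only the condition `notEnough` and the allocation "2 notEnough, 3 ELSE_" are taken from figure 6. The sketch says nothing about when operations run relative to the condition.

```python
def select_transition(candidates, variables):
    """Return the single enabled transition among those sharing (state, input).

    Each candidate is (label, guard); guard is a predicate on the process
    variables, or None for the ELSE_ branch.
    """
    enabled = [
        label for label, guard in candidates
        if guard is not None and guard(variables)
    ]
    if len(enabled) > 1:
        raise ValueError(f"non-deterministic: {enabled}")
    if enabled:
        return enabled[0]
    else_branches = [label for label, guard in candidates if guard is None]
    if len(else_branches) != 1:
        raise ValueError("no enabled transition and no unique ELSE_ branch")
    return else_branches[0]


def not_enough(v):
    # notEnough (SELF.amount < SELF.price)
    return v["amount"] < v["price"]


# Transitions for the message Coin in the cashing state.
coin_in_cashing = [(2, not_enough), (3, None)]

print(select_transition(coin_in_cashing, {"amount": 20, "price": 50}))
print(select_transition(coin_in_cashing, {"amount": 50, "price": 50}))
```

A condition paired with an ELSE_ branch always enables exactly one transition, so the pair is deterministic by construction.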

Variables, data types, operations and conditions are formulated in the programming 
language of the generated code. From the high level modelling point of view these 
constructs represent primitives which add computational power to the pure models. 
The specified code constructs are incorporated inline in the generated code. From a 
theoretical point of view it would be more elegant to use a functional specification 
language, but in practice the value of this pragmatic approach based on the 
implementation language has been clearly confirmed: it permits easy use of functions, 
data types and object classes from existing libraries. 

The Door and Cashier processes presented react to event messages indicating 
discrete state changes of an external process. Control of continuous processes on the 
other hand is based on periodic sampling of the continuous process states. Figure 7 
shows the transition structure of a process regulating the temperature of a liquid by 
means of a heater with continuously variable heating power. The process reacts to the 
periodically occurring Sample message giving the sampled liquid temperature. Control 
algorithms are allocated to transitions. Feedback control is performed by means of the 
data carried by the SetHeater output message. 
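
A time-driven process of this kind could look roughly like the sketch below. Only the `Sample` input and the `SetHeater` output are taken from the text; the two states, the switching thresholds and the proportional control law are assumptions chosen to keep the example short.

```python
class TempControl:
    """Reacts to periodic Sample messages; states and control law are assumptions."""

    def __init__(self, setpoint, gain=0.1, band=0.5):
        self.setpoint = setpoint
        self.gain = gain
        self.band = band
        self.state = "heating"

    def sample(self, temperature):
        """Handle one Sample message and return the SetHeater power in [0, 1]."""
        error = self.setpoint - temperature
        # State changes depend on the sampled value, not on discrete events.
        if self.state == "heating" and error <= self.band:
            self.state = "regulating"
        elif self.state == "regulating" and error > 5 * self.band:
            self.state = "heating"
        # The control algorithm allocated to the transition computes the output.
        if self.state == "heating":
            return 1.0                                  # full power until close
        return min(1.0, max(0.0, self.gain * error))    # proportional control


control = TempControl(setpoint=60.0)
for temperature in (20.0, 59.8, 61.0, 50.0):
    print(control.state, control.sample(temperature))
```

Each call to `sample` corresponds to one transition of the process, so the sampling period fixes the rate at which the feedback loop runs.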




Fig. 7. Time driven state machine 

Master-Slave Hierarchies - Behavioural Structuring. The state transition structure 
of a CIP process specifies how the process must react to inputs by state changes and 
generated outputs. Often such a behaviour becomes quite complex due to inputs which 
must influence the full future behaviour of the process. An alarm message, for 
instance, must lead to behaviour different from the normal case until the alarm is 
reset. The resulting transition structure will then represent a kind of superposition of 
the structures for the normal and the alarm case. 

To disentangle such implicit superpositions the full behaviour of a process is 
modelled by a number of alternative modes. Each mode is specified graphically by a 
state transition diagram, based on the states, ports and pulses of the process.

[Figure: state transition diagrams of MODE normal and MODE shutting]

Fig. 8. Two modes of the Door process

Figure 8 shows an elaboration of the Door process (fig. 5) by an additional shutting 
mode. This mode describes an alternative behaviour of the process, usable when an 
alarm or error condition occurs. 

The mode changes of a process can be induced by one or more processes 
designated as master. The master-slave relation of a cluster is specified by a master- 
slave graph (fig. 9). Master-slave connections are represented by triangles which are 
connected at the bottom angle to a slave and at the top side to one or more masters. 
The graph is restricted to be acyclic in order to define a hierarchical structure. 










Fig. 9. Master-slave hierarchy graph 

The levels indicated in the graph of figure 9 have no formal meaning. Levelling is 
merely used informally to group processes interacting on a common level of 
abstraction. Usually not all processes of a cluster are involved in the hierarchical 
structure. 

The behavioural semantics of master-slave relations is defined as follows: 

The active mode of a slave is determined by the current states of its masters. 

The association of master states and slave modes is specified by a corresponding mode 
setting table which defines a total function from the Cartesian product of master states 
to the modes of the slave. 
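
A mode setting table and the totality requirement can be sketched as follows. The two masters (`Supervisor` and `Alarm`), their states and the table entries are hypothetical; only the slave modes `normal` and `shutting` come from the Door example.

```python
from itertools import product


def check_total(master_states, slave_modes, table):
    """Verify that table maps every combination of master states to a slave mode."""
    for combo in product(*master_states):
        if combo not in table:
            raise ValueError(f"missing entry for {combo}")
        if table[combo] not in slave_modes:
            raise ValueError(f"unknown mode {table[combo]!r}")
    return True


# Hypothetical masters of the Door slave: a Supervisor and an Alarm process.
master_states = [("manual", "auto"), ("quiet", "raised")]
door_modes = {"normal", "shutting"}
mode_setting = {
    ("manual", "quiet"): "normal",
    ("auto", "quiet"): "normal",
    ("manual", "raised"): "shutting",
    ("auto", "raised"): "shutting",
}


def active_mode(supervisor_state, alarm_state):
    """The active mode is a pure function of the current master states."""
    return mode_setting[(supervisor_state, alarm_state)]


check_total(master_states, door_modes, mode_setting)
print(active_mode("auto", "raised"))
```

Because the table is total, the slave never needs to store its mode: it can always be recomputed from the masters' states.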

Thus a mode change of a slave can occur whenever one of its masters changes its 
state. Even when the Door process of figure 8 is in the closed state in the normal 
mode, a master can induce a switch to the shutting mode. The effect is simply that the 
door will remain closed when an open pulse is sent to the process. If for functional 
reasons a mode change should not occur in certain states, it must be prevented by 
means of explicitly modelled interaction between slave and master. 

A slave can itself initiate a mode change by sending a pulse to one of its masters. 
This pattern is used typically when an error is recognised by a slave and its master 
must then be triggered to induce a change to an error mode in other slaves as well.

It is important to note that a change of mode does not affect the current state of a
slave, which can change only when a transition in the active mode is triggered by an 
input. The rule reflects the fact that a change of mode does not alter the history of 
basic interactions. 
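
The rule can be illustrated with a slave whose modes are transition tables over one shared set of states. The class name and the concrete tables are assumptions; the behaviour shown (a closed door stays closed on an open pulse while in the shutting mode) follows the description above.

```python
class DoorSlave:
    # mode -> {(state, inpulse): next state}; both modes use the same states
    MODES = {
        "normal": {("closed", "open"): "opening", ("opened", "close"): "closing"},
        "shutting": {("opened", "close"): "closing"},  # open pulses are ignored
    }

    def __init__(self):
        self.state = "closed"
        self.mode = "normal"

    def set_mode(self, mode):
        # Induced by a master; the current state is deliberately left untouched.
        self.mode = mode

    def pulse(self, inpulse):
        # Only a transition of the active mode can change the state.
        key = (self.state, inpulse)
        self.state = self.MODES[self.mode].get(key, self.state)


door = DoorSlave()
door.set_mode("shutting")
door.pulse("open")
print(door.state)   # still closed
door.set_mode("normal")
door.pulse("open")
print(door.state)   # now opening
```

Keeping one state variable across all modes is what preserves the interaction history when the mode is switched back.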

CIP modes differ essentially from the well-known notion of superstates or serial 
modes of hierarchical state machine models [5, 14]. The modes of a CIP process are 
defined on the same set of states, while superstates describe exclusive behaviours 
based on disjoint sets of states. Thus the history expressed by the current states of the 
lower levels cannot be retained when the pertaining superstate is changed. 

A further difference from hierarchical state machines lies in the nature of the
hierarchy relation. A master-slave hierarchy relation is a set of associated state 
machines, whereas hierarchical state machines represent compositions based on nested 
state sets. 

Process Arrays - Static Replication of Processes. Replicated processes are
modelled as multidimensional process arrays. The multiplicities of the individual
array dimensions are defined by abstract index types. Using common index types for
different process arrays allows modelling of finite relations among process arrays,
of the kind usually expressed by means of entity-relationship diagrams.



4. Domain-Oriented Development - The TCS Case Study 

An operational behavioural model provides a well defined level of abstraction. In 
addition to these "guard rails", the CIP method adopts the concept of environment- 
oriented behavioural modelling from the ISD method. This concept bases the 
development of control systems on a realised model of the environment inside the 
system in order to capture the essential behaviour of the processes to be controlled. 
CIP models are therefore constructed in a sequence of three steps: 

1. Specification of the virtual real world interface

2. Establishing a behavioural context model 

3. Construction of control functions 



Fig. 10. Development steps (three panels showing the environment and the CIP model: 1. Events & Actions, 2. Context Model, 3. Complete Model)



The virtual real world interface is specified by collections of events and actions used to
bring about the required functional behaviour. The context model is the first part of 
the CIP model to be constructed. It consists of source and sink channels attached to 
CIP processes that consume event messages and produce action messages. The 
transition structures of these processes represent protocols of the virtual 
communication with the environment. In the third step the CIP model is completed by 
creating function processes which interact and communicate with the processes of the 
established context model. 

4.1 Problem Statement of the TCS Case Study

The TCS (TransportControlSystem) example illustrates how a CIP model is developed 
and how a simple master-slave hierarchy works. The resulting cluster represents an 
executable solution of the stated problem. The algorithmic requirements are trivial, so 
the model consists of pure state machines only. 

Plant description. The plant to be controlled comprises a conveyor moving objects in
one direction, a scanner serving to identify loaded objects and a switch allowing the
operator to enable and disable object scanning. A loading sensor at the front end of
the conveyor detects loaded objects.

Requirements. The system starts its function when the first object is loaded. When 
the conveyor is not loaded for 30 seconds, its motor is to be turned off. Two modes of 
operation are required. If the switch is set off, the started conveyor moves conti- 
nuously and the scanner is not activated. If the switch is set on, the conveyor must stop 
when an object is loaded, the scanner is activated, and the conveyor starts moving 
again when the scanner indicates that scanning is complete. 

4.2 Virtual Real World Interface: Events and Actions 

In the first step a virtual interface to the environment is specified by identifying events 
to be detected and actions to be produced by the embedded system. Events and actions 
designate instantaneously occurring subject-matter phenomena of the environment. 

Fig. 11. TCS plant (labelled elements: scanner, switch, loading sensor, conveyor)

TCS - Virtual Real World Interface

Event List
  On / Off          the switch is set on / off
  Load / Free       an object is loaded / is moved away from the loading place
  Scanned           scanning is completed

Action List
  MotOn / MotOff    turning the conveyor motor on / off
  Scan              activating the scanner

Fig. 12. Event and action lists



4.3 Behavioural Context Model: Channels and Interface Processes 

The purpose of the context model is to establish the interface processes of the CIP 
model and to connect them virtually to the environment. The interface processes are 
deduced from descriptions and process models of the environment. They receive event 
messages and produce action messages through appropriately specified source and 
sink channels. The state transition structures of the interface processes describe the 
valid sequences of received event and produced action messages. Thus the context
model formally describes the behaviour of the individual external processes, seen 
from the CIP model. 

The channels of the context model represent a virtual connection to the 
environment. The event and action messages of these channels must correspond to the 
events and actions of the virtual real world interface. These channels are also used as 
the interface model for the CIP components to be generated later on as described in 
section 5. 



TCS - Context Model

COMMUNICATION NET InterfaceChannels
  TransportCluster

CHANNEL SwitchEvt MESSAGES Off, On
CHANNEL ScanEvt MESSAGES Scanned
CHANNEL ScanAct MESSAGES Scan
CHANNEL ConvEvt MESSAGES Free, Load
CHANNEL ConvAct MESSAGES MotOff, MotOn

PROCESS Switch
PROCESS Scanner
PROCESS Conveyor
  MODE ongoing
  MODE stepped

[State transition diagrams of the interface processes; legible Conveyor states: loadedStopped, loadedMoving]

Fig. 13. Context model specification

The incomplete transition structures of the preliminary independent interface 
processes must be completed in the control function step. Modelling interface 
processes means understanding the behaviour of the external processes; but it also
means anticipating the way they will be controlled when the system is completed. The 
interface behaviour of the Conveyor process, for example, has already been defined by 
the two modes ongoing and stepped, corresponding to the modes of operation of the 
required system function. 



4.4 Construction of Control Functions 



The interface processes are grouped into asynchronous clusters. The partition into 
clusters determines how the system can be implemented later on by concurrently 
running CIP components. A possible reason to refine the initial clustering is 
modularisation: because of their asynchronous behaviour, clusters represent very 
weakly coupled functional blocks, well suited to development and validation by 
different members of the development team. 

To bring about the required behaviour of the environment, function processes are 
created and connected appropriately with the established interface processes. The 
primary functionality of the system is first developed, based on the normal behaviour 
defined by the model processes. To permit reaction to unexpected events, the interface 
processes concerned must usually be extended by error modes; also, additional 
supervisor and error handling processes must be introduced. 

TCS - Complete Model 

The CIP model of the simple case study consists of one cluster only. The function 
process Controller controls the cooperation of the Conveyor and the Scanner 
processes. The required modes of operation are modelled by corresponding modes of 
the Controller. The current state of the Switch master process determines the active 
mode of the Controller and the Conveyor process. The ongoing mode of the Conveyor 
interface process has been extended by transition 2; the transition is necessary because 
the switch can be set off even when the conveyor is stopped for scanning. The 
Controller process uses a timer supported by the CIP Tool; provision of such timers is 
a form of ‘modelling sugar’. Because the TIMEUP_ trigger represents an external 
cluster input, a Controller cascade is also needed. 



Remark. The corner marks of a process box indicate external input or output respectively: top
right: channel input, top left: timer input, bottom right: channel output.



CLUSTER TransportControl

INTERACTION NET
  [diagram; legible node: Switch]

CASCADES
  [diagram]

PULSE TRANSLATIONS
  Controller.move   -> Conveyor.move
  Controller.stop   -> Conveyor.stop
  Controller.scan   -> Scanner.scan
  Conveyor.loaded   -> Controller.loaded
  Scanner.scanned   -> Controller.scanned

MASTER-SLAVE GRAPH
  [diagram]

PROCESS Conveyor
  MODE SETTING Switch.off -> ongoing, Switch.on -> stepped
  MODE ongoing
  MODE stepped

PROCESS Controller
  MODE SETTING Switch.off -> freeLoading, Switch.on -> scannedLoading
  MODE freeLoading
  MODE scannedLoading

Legend: stop timer

Fig. 14. Complete model specification

The model can be animated by entering event messages or the TIMEUP_ trigger. The
following trace describes five cluster transitions of a particular animation: 



PROCESS     MODE            PRESTATE       INPUT    POSTSTATE      OUTPUT

Conveyor    ongoing         idle           Load     loadedMoving   loaded, MotOn
Controller  freeLoading     idle           loaded   moving         T

Switch      -               off            On       on
Conveyor    ongoing POSTMODE stepped
Controller  freeLoading POSTMODE scannedLoading

Conveyor    stepped         loadedMoving   Free     moving

Conveyor    stepped         moving         Load     loadedStopped  loaded, MotOff
Controller  scannedLoading  moving         loaded   scanning       scan, S
Scanner     -               ready          scan     scanning       Scan

Scanner     -               scanning       Scanned  ready          scanned
Controller  scannedLoading  scanning       scanned  moving         move, T
Conveyor    stepped         loadedStopped  move     loadedMoving   MotOn

Fig. 15. Animation trace: five cluster transitions
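
The trace can be replayed with a small Python sketch. It encodes only the process transitions that are visible in Fig. 15, together with the mode settings and pulse translations of Fig. 14; the full transition structures of the model are not reproduced here. The data layout and the names `TRANSITIONS`, `cluster_transition`, and `replay` are assumptions of this sketch, and the timer outputs `T` and `S` are passed through as opaque symbols exactly as they appear in the trace.

```python
# (mode, prestate, input) -> (poststate, emitted pulses, external outputs)
TRANSITIONS = {
    "Switch": {("-", "off", "On"): ("on", [], [])},
    "Conveyor": {
        ("ongoing", "idle", "Load"): ("loadedMoving", ["loaded"], ["MotOn"]),
        ("stepped", "loadedMoving", "Free"): ("moving", [], []),
        ("stepped", "moving", "Load"): ("loadedStopped", ["loaded"], ["MotOff"]),
        ("stepped", "loadedStopped", "move"): ("loadedMoving", [], ["MotOn"]),
    },
    "Controller": {
        ("freeLoading", "idle", "loaded"): ("moving", [], ["T"]),
        ("scannedLoading", "moving", "loaded"): ("scanning", ["scan"], ["S"]),
        ("scannedLoading", "scanning", "scanned"): ("moving", ["move"], ["T"]),
    },
    "Scanner": {
        ("-", "ready", "scan"): ("scanning", [], ["Scan"]),
        ("-", "scanning", "Scanned"): ("ready", ["scanned"], []),
    },
}

# Mode setting tables: state of the Switch master -> mode of the slave.
MODE_SETTING = {
    "Conveyor": {"off": "ongoing", "on": "stepped"},
    "Controller": {"off": "freeLoading", "on": "scannedLoading"},
}

PULSE_TRANSLATION = {
    ("Controller", "move"): ("Conveyor", "move"),
    ("Controller", "stop"): ("Conveyor", "stop"),
    ("Controller", "scan"): ("Scanner", "scan"),
    ("Conveyor", "loaded"): ("Controller", "loaded"),
    ("Scanner", "scanned"): ("Controller", "scanned"),
}

# Which interface process consumes which event message.
EVENT_RECEIVER = {"On": "Switch", "Off": "Switch", "Load": "Conveyor",
                  "Free": "Conveyor", "Scanned": "Scanner"}


def initial_states():
    return {"Switch": "off", "Conveyor": "idle",
            "Controller": "idle", "Scanner": "ready"}


def mode_of(process, states):
    table = MODE_SETTING.get(process)
    return table[states["Switch"]] if table else "-"


def cluster_transition(states, event):
    """Run one cluster transition: an event plus the pulse chain it triggers."""
    outputs = []
    pending = [(EVENT_RECEIVER[event], event)]
    while pending:
        process, message = pending.pop(0)
        key = (mode_of(process, states), states[process], message)
        poststate, pulses, external = TRANSITIONS[process][key]
        states[process] = poststate
        outputs.extend(external)
        pending.extend(PULSE_TRANSLATION[(process, p)] for p in pulses)
    return outputs


def replay(events):
    states = initial_states()
    return [cluster_transition(states, e) for e in events], states
```

Calling `replay(["Load", "On", "Free", "Load", "Scanned"])` walks through the five cluster transitions of the trace. Because the active mode is looked up from the Switch state on every process transition, the `On` event changes the mode of Conveyor and Controller without touching their current states.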

4.5 More about Events and Actions

The elaboration of event and action collections represents a crucial development step 
because it determines the level of abstraction used to solve the functional problem. On 
the one hand, the collected events and actions must suffice to bring about the required 
behaviour of the environment. On the other hand, the feasibility of the connection 
must be ensured by verifying that all events can be recognised and all actions can be 
produced by means of the available interface devices. 

Events and actions can have attributes which are transmitted as data of the 
corresponding channel messages. An event attribute describes a circumstance of an 
occurring event: for example, the bar code read by a scanner or the parameters of an 
operator command. An action attribute describes a circumstance to be brought about 
when the action is performed, such as the position of an opened valve. 

Events are classified into process events and temporal events. Process events are 
caused by the autonomous dynamics of external processes, while temporal events 
occur at prescribed points in time. In general, a process event is related to a discrete 
change of the process. Discrete states often denote a whole range of external process 
states, thus representing abstractions essential for the required functional behaviour. 
Examples are the level ranges of a liquid in a vessel, or the set of ready states of a
complex device. Continuous states must be monitored by sampling events that capture 
the behaviour of continuous processes. Sampling events are periodically occurring 
temporal events whose attributes are the sampled process states. Similarly, the 
embedded system influences continuous processes by repeated production of 
attributed actions. 

5. Component-Based Implementation of CIP Models

For implementation of a CIP model the set of clusters is partitioned into CIP units. 
Each unit can be transformed automatically into a software component, executable in 
a concurrent thread of the implemented system. The code for a CIP unit consists of a 
CIP shell and a CIP machine, and is produced in two individual generation steps. 



[Figure: a CIP unit within the implemented system; legible labels: generated code (twice), user written code, low level drivers, environment, external processes, internal communication]

Fig. 16. Implementation of a CIP unit



The CIP shell represents the interface of the CIP unit: it consists of two linear 
structures of function pointers, one for the incoming and one for the outgoing 
channels. The CIP shell code is generated from the channel specifications only, and so 
is independent of the modelled clusters. 

The CIP machine is a passive object implementing the reactive behaviour of the 
CIP unit; it is activated by channel function calls through the input shell. Every call 
triggers a cluster transition from which channel functions are called through the output 
shell. 
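
The division of labour between shell and machine can be pictured with a short Python sketch. It is not the code produced by the CIP tool: the generated shells are described above as linear structures of function pointers, which are mimicked here by dictionaries of callables, and the class `CipMachine`, the function `make_unit`, and the single toy transition are assumptions made for illustration. Only the channel and message names are taken from the TCS example.

```python
from typing import Callable, Dict, List, Tuple

ChannelFunction = Callable[[str], None]


class CipMachine:
    """Passive object: it runs only when a channel function is called."""

    def __init__(self, output_shell: Dict[str, ChannelFunction]):
        self._out = output_shell
        self.state = "idle"

    def conv_evt(self, message: str) -> None:
        # One toy cluster transition; every other input is ignored here.
        if self.state == "idle" and message == "Load":
            self.state = "moving"
            self._out["ConvAct"]("MotOn")  # call through the output shell


def make_unit(sink: List[Tuple[str, str]]):
    """Wire a machine between an input shell and an output shell."""
    # Output shell: one function per outgoing channel. In a real system these
    # would be implemented by the embedded connector; here they log to a list.
    output_shell: Dict[str, ChannelFunction] = {
        "ConvAct": lambda msg: sink.append(("ConvAct", msg)),
    }
    machine = CipMachine(output_shell)
    # Input shell: one function per incoming channel, bound to the machine.
    input_shell: Dict[str, ChannelFunction] = {"ConvEvt": machine.conv_evt}
    return input_shell, machine
```

The connector side sees only the two shells: it calls `input_shell["ConvEvt"]("Load")` and supplies the functions behind the output shell. The machine never takes the initiative, which reflects the passive role described above.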

Partitioning the model creates additional connection problems due to the channels 
interconnecting the CIP units. Thus the task of constructing an active embedded 
connector is twofold. On the one hand, a subset of the controlled processes must be 
connected to the CIP machine: these connections correspond to the source and sink 
channels of the CIP unit. The implementation of this part of the connection demands 
tools and techniques suited to the technology of the interface devices. Communication 
between CIP units, on the other hand, can usually be implemented by means of 
standard transmission techniques based on field bus systems or serial connections. 

Because of the cooperation of parallel entities modelled within CIP models, there is 
no need to implement conceptual parallelism by means of multi-tasking. Only one or 
very few CIP units are usually implemented on the same processor. Task scheduling 
and interrupt handling are thus reduced to hardware interface functions and 
background services [15]. 

The environment-oriented development process of the CIP method allows the 
various CIP shells to be modelled at an early stage. The tool supports locking of these 
interface specifications for extended periods. The generated shell code serves as semi-rigid
joints among CIP machines and embedded connectors. Thus, once a CIP shell is
defined, the associated CIP machine model and its embedded connector can be 
developed concurrently. The concept has been proven in a number of industrial and 
academic projects, where different partitions of the same CIP model had to be 
connected to simulation models, to test beds for specific system parts, and to the real 
environment of the final target system. In the development of a hybrid car, even the 
code for the connector has been generated from formal connector descriptions [16]. 

6. Summary 

The CIP method is tailored to control problems typically encountered in the 
development of embedded systems. Identifying the characteristic difficulties of this 
problem class, the method offers suitable development concepts and modelling 
techniques to promote system development based on engineering activities. 

A central difficulty in the development of embedded systems results from the need 
to monitor and control real world phenomena by means of specific interface devices. 
The general embedded system problem is therefore decomposed into a functional 
problem to be solved by means of formal behavioural models, and a connection 
problem demanding development techniques adapted to the technology of the 
interface devices. By stabilising dependencies at an early stage, the resulting 
development process allows the two problems to be solved independently. 

Most reactive behavioural models support either synchronous or asynchronous 
cooperation of behavioural entities. In order to support internal synchronous 
interaction propagation as well as flexible distributed implementation of system parts, 
the CIP method combines both cooperation paradigms within the same model. CIP 
models consist of asynchronous clusters of processes that are synchronously 
cooperating extended state machines. 

In order to make interaction dependencies among processes explicit, CIP models are
constructed by means of architectural composition. However, the problem of 
conflicting and unbounded internal interaction can be solved only by restricting the set 
of possible interaction paths. CIP therefore requires the control flow of interaction to 
be specified by process cascades, resulting in deterministic behavioural models with 
bounded response times. 

Behavioural structuring is supported by a novel hierarchical structure: the master- 
slave hierarchy. The hierarchical structure is based on master processes which induce 
high level behaviour changes in designated slave processes. The problem-oriented 
hierarchy relation is well suited to express powerful behavioural abstractions. This 
concept has proved much more flexible than rigid nesting of transition structures. 

Control functions of embedded systems must maintain an ongoing behavioural 
relationship with the controlled external processes. The model construction process 
therefore starts by developing a behavioural context model that defines the legal 
histories of external interactions. The full functional solution is constructed in further 
development steps where function processes are added and connected to the context 
model. 

The integration of the CIP code is based on component technology. Various 
configurations of generated software components can be connected to interface 
modules and components supporting other concerns such as validation and simulation.
The resulting flexibility in building executable system parts becomes crucial when 
developed systems have to be validated in various test environments. 

Abstract. The ability to reconfigure software architectures in order
to adapt them to new requirements or a changing environment has been
of growing interest, but there is still not much formal work in the area.
Most existing approaches deal with run-time changes in a deficient way.
The language to express computations is often at a very low level of
specification, and the integration of two different formalisms for the
computations and reconfigurations sometimes requires substantial changes. To
address these problems, we propose a uniform algebraic approach with
the following characteristics.

1. Components are written in a high-level program design language 
with the usual notion of state. 

2. The approach combines two existing frameworks (one to specify
architectures, the other to rewrite labelled graphs) just through small
additions to either of them.

3. It deals with certain typical problems such as guaranteeing that new 
components are introduced in the correct state (possibly transferred 
from the old components they replace). 

4. It shows the relationships between reconfigurations and computa- 
tions while keeping them separate, because the approach provides a 
semantics to a given architecture through the algebraic construction 
of an equivalent program, whose computations can be mirrored at 
the architectural level. 



1 Introduction 

1.1 Motivation 

One of the topics which is raising increased interest in the Software Architecture 
(SA) community is the ability to specify how a SA evolves over time, in particular 
at run-time, in order to adapt to new requirements or new environments, to 
failures, and to mobility [28,6,36]. There are several issues at stake [27], among 
them: 

modification time and source. Architectures may change before execution,
or at run-time (called dynamic reconfiguration). Run-time changes may be
triggered by the current state or topology of the system (called programmed
reconfiguration [7]).

modification operations. The four fundamental operations are addition and
removal of components and connections. Although their names vary, those
operators are provided by most reconfiguration languages (like [16,7,22,1]).
In programmed reconfiguration, the changes to perform are given with the
initial architecture, but they may be executed when the architecture has
already changed. Therefore it is necessary to query at run-time the state of
the components and the topology of the architecture.

system state. Reconfiguration must cause the least possible disruption [32] and
the new system must be in a consistent state.



1.2 Related Work 

Only a few ADLs are able to express dynamism [23]. Darwin [20] only permits
constrained dynamism: the initial architecture may depend on some parameters,
and during run-time components may be replicated. Wright's formalism is a bit
cumbersome since it requires all distinct configurations to be uniquely tagged
[1]. ACME's proposal only allows for the specification of optional elements (i.e.,
components, connectors, and links) [26].

There has also been formal work in the area. Some of it uses two different
formalisms to represent reconfigurations and computations [24,2,1,17] while
other approaches are uniform [33,31,12,15], using some notion of rewriting. These
works have some common drawbacks.

First, the languages to represent computations are very simple and at a low- 
level: rewriting of labels [15], process calculi [24,17,2,1], term rewriting [33,12], 
graph rewriting [31]. They do not capture some of the abstractions used by 
programmers and often lead to cumbersome specifications. 

Second, the combination of reconfigurations and computations leads to additional,
sometimes complex, formal constructs: [15] uses constraint solving,
[24,1,2] define new semantics or language constructs for the process calculi, [12]
must dynamically change the rewriting strategies, [31] imposes many constraints
on the form of graph rewrite rules because they are used to express computation,
communication, and reconfiguration.

A result of these shortcomings is that reconfigurations are constrained ([15] 
uses only context-free rules and [1] does not handle systems with potentially infi- 
nite distinct configurations) or that the relationship between the reconfigurations 
and computations is complex [1,24] or not quite apparent [2,17]. 



1.3 Approach 

To overcome some of these disadvantages, we propose to use a uniform algebraic 
framework and a program design language with explicit state. The former allows 
us to represent both architectures and their reconfigurations, and to explicitly 
relate the computational with the architectural level in a direct and simple way. 
On the other hand, the language incorporates some of the usual programming 
constructs while keeping a simple syntax to be formally tractable. 

To be more precise, components are written in a UNITY-like language [11],
and consist of a set of “if-then” commands with assignments to typed attributes.
Attributes have explicit initialisation conditions and the state of a component is
given as usual by a set of attribute/value pairs.

The algebraic framework is Category Theory [25], which has several benefits
for SA [10]. Among them, it has a graphical notation, it is independent of the
language used to represent components and is thus able to relate different
languages [9], and it allows the formalisation of connectors and their construction
[8,34,35].

Like [24,33,31,15], we represent architectures by labelled graphs showing the 
components and their interconnections. Additionally, the categorical framework 
provides a semantics, given by an operation that transforms the architectural 
diagram into an equivalent component representing the whole system on which 
computations are performed. This relates the architectural and computational
levels. 

Reconfiguration is specified through algebraic graph rewriting [5], a formalism
with many years of research into its theory and application. Together with
the representation of components, the approach guarantees that components are 
removed in a quiescent state [16] (i.e., when not interacting with other compo- 
nents) and are introduced in a properly initialized state. 

To make the paper self-contained and accessible to a wider audience, it is 
written in an informal way, using mathematical definitions sparingly. Also, all 
needed algebraic notions are briefly introduced. To facilitate exposition, the sec- 
ond part deals with reconfigurations performed when the system is shut down, 
while the third part incorporates the notion of state, in order to deal with re- 
configurations that have to be coordinated with on-going computations. 

1.4 The Example 

The example is inspired by the luggage distribution system used to illustrate
Mobile Unity [29]. One or more carts move on a track N units long with the
shape




A cart advances one unit at each step in the direction shown by the arrows. 
The i-th cart starts from a unit determined by an injective function start of 
i. Carts are continuously moving around the circuit. Their movement must be 
synchronised in such a way that no collisions occur at the crossing. We assume 
that track units 7 and 28 cross. Reconfigurations may be due not only to mobility 
but also to component upgrade: a cart may be replaced by one with a built-in
lap counter.

1.5 Category Theory 

Category Theory [25] is the mathematical discipline that studies, in a general and 
abstract way, relationships between arbitrary entities. A category is a collection 
of objects together with a collection of morphisms between pairs of objects. A
morphism $f$ with source object $a$ and target object $b$ is written $f : a \to b$ or
$a \xrightarrow{f} b$. Morphisms come equipped with a composition operator “$\circ$” such that
if $f : a \to b$ and $g : b \to c$ then $g \circ f : a \to c$. Composition is associative and has
identities $id_a$ for every object $a$.

Diagrams are directed graphs — where nodes denote objects and arcs repre- 
sent morphisms — and can be used to represent “complex” objects as configura- 
tions of smaller ones. For categories that are well behaved, each configuration 
denotes an object that can be retrieved through an operation on the diagram 
called colimit. Informally, the colimit of a diagram returns the “minimal” object 
such that there is a morphism from every object in the diagram to it (i.e., the 
colimit contains the objects in the diagram as components) and the addition of 
these morphisms to the original configuration results in a commutative diagram 
(i.e., interconnections, as established by the morphisms of the configuration di- 
agram, are enforced). 



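Pushouts are colimits of diagrams of the form $b \xleftarrow{f} a \xrightarrow{g} c$. By definition of
colimit, the pushout returns an object $d$ such that the diagram

$$
\begin{array}{ccc}
a & \xrightarrow{\;f\;} & b \\
{\scriptstyle g}\downarrow & \searrow{\scriptstyle i} & \downarrow{\scriptstyle h} \\
c & \xrightarrow{\;j\;} & d
\end{array}
$$

exists and commutes (i.e., $h \circ f = i = j \circ g$). Furthermore, for any other pushout
candidate $d'$, there is a unique morphism $k : d \to d'$. This ensures that $d$, being
a component of any other object in the same conditions, is minimal. Object $c$ is
called the pushout complement of diagram $a \xrightarrow{f} b \xrightarrow{h} d$.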
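
As a concrete illustration, a pushout can be computed directly in the category of finite sets and total functions. This is a sketch under a simplifying assumption: the paper works with labelled graphs, not plain sets, and the function `pushout` with its dictionary encoding of $f$ and $g$ is introduced here only for the example. The construction takes the disjoint union of $b$ and $c$ and merges $f(x)$ with $g(x)$ for every $x$ in $a$.

```python
def pushout(a, b, c, f, g):
    """Pushout of b <-f- a -g-> c in finite sets.

    f and g are dicts from elements of a to elements of b and c.
    Returns (d, h, j), where d is a set of equivalence classes (frozensets
    of tagged elements) and h: b -> d, j: c -> d are dicts.
    """
    # Disjoint union of b and c, with tags to keep equal elements apart.
    elements = [("b", y) for y in b] + [("c", z) for z in c]
    parent = {e: e for e in elements}

    def find(e):
        # Union-find lookup with path halving.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    # Glue: identify f(x) with g(x) for every x in a.
    for x in a:
        root_b, root_c = find(("b", f[x])), find(("c", g[x]))
        if root_b != root_c:
            parent[root_b] = root_c

    classes = {}
    for e in elements:
        classes.setdefault(find(e), set()).add(e)
    class_of = {e: frozenset(members)
                for members in classes.values() for e in members}

    d = set(class_of.values())
    h = {y: class_of[("b", y)] for y in b}
    j = {z: class_of[("c", z)] for z in c}
    return d, h, j
```

For `a = {0}`, `b = {"x", "y"}`, `c = {"u"}` with `f = {0: "x"}` and `g = {0: "u"}`, the result has two elements: one class in which `x` and `u` are glued together, and one containing only `y`. The square commutes because `h[f[0]]` and `j[g[0]]` are the same class.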



1.6 Graph Transformation 

The algebraic approach to graph transformation [5] was introduced over 20 years 
ago in order to generalize grammars from strings to graphs. Hence it was neces- 
sary to adapt string concatenation to graphs. The approach is algebraic because 
the gluing of graphs is done by a pushout in an appropriate category. There are 
two main variants, the double-pushout approach [5] and the single-pushout ap- 
proach [19]. We first present the former. It is based on a category whose objects 
are labelled graphs and whose morphisms $f : a \to b$ are total maps (from a's
nodes and arcs to those of b) that preserve the labels and the structure of a.

A graph transformation rule, called graph production, is simply a diagram of
the form $L \xleftarrow{l} K \xrightarrow{r} R$, where $L$ is the left-hand side graph, $R$ the right-hand
side graph, $K$ the interface graph and $l$ and $r$ are injective graph morphisms.
The rule states how graph $L$ is transformed into $R$, where $K$ is the common
subgraph, i.e., those nodes and arcs that are not deleted by the rule. As an
example, the rule

[Figure: example production $L \xleftarrow{l} K \xrightarrow{r} R$]

substitutes an arc by another. Graphs are written within dotted boxes to improve
readability. Nodes and arcs are numbered uniquely within each graph to show
the mapping done by the morphisms.

A production $p$ can be applied to a graph $G$ if the left-hand side can be
matched to $G$, i.e., if there is a graph morphism $m : L \to G$. A direct derivation
from $G$ to $H$ using $p$ and $m$ exists if the diagram

$$
\begin{array}{ccccc}
L & \xleftarrow{\;l\;} & K & \xrightarrow{\;r\;} & R \\
{\scriptstyle m}\downarrow & & {\scriptstyle d}\downarrow & & \downarrow{\scriptstyle m^*} \\
G & \xleftarrow{\;l^*\;} & D & \xrightarrow{\;r^*\;} & H
\end{array}
$$

can be constructed, where each square is a pushout. Intuitively, first the pushout
complement $D$ is obtained by deleting from $G$ all nodes and arcs that appear
in $L$ but not in $K$. Then $H$ is obtained by adding to $D$ all nodes and arcs that
appear in $R$ but not in $K$. The fact that $l$ and $r$ are injective guarantees that $H$
is unique. An example derivation using the previously given production is



[Figure: example direct derivation from G to H using the production above]



A direct derivation is only possible if the match $m$ obeys two conditions. First,
if the production removes a node $n \in L$, then each arc incident to $m(n) \in G$
must be the image of some arc attached to $n$. Second, if the production removes one
node (or arc) and maintains another one, then $m$ may not map them to the same
node (or arc) in $G$. Both conditions are quite intuitive. The first one prevents
dangling arcs, the second one avoids contradictory situations. Both allow an
unambiguous prediction of removals. A node of $G$ will be removed only if its
context (i.e., adjacent arcs and nodes) is completely matched by the left-hand
side of some production. The advantage is that the production specifier can
control exactly in which contexts a node is to be deleted. This means it is not
possible to remove a node no matter what other nodes are linked to it.
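
The two conditions can be checked mechanically. The following Python sketch is one possible encoding, not part of the paper: graphs are unlabelled for brevity, the interface $K$ is given as the subsets of nodes and arcs of $L$ that the production keeps (which is possible because $l$ is injective), and the names `Graph`, `dangling_ok`, and `identification_ok` are assumptions of the example.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: set
    arcs: dict = field(default_factory=dict)  # arc id -> (source, target)


def dangling_ok(L, K_nodes, G, node_map, arc_map):
    """First condition: removing a node must not leave arcs of G dangling."""
    for n in L.nodes - K_nodes:  # nodes removed by the production
        attached = {a for a, (s, t) in L.arcs.items() if n in (s, t)}
        images = {arc_map[a] for a in attached}
        for arc, (s, t) in G.arcs.items():
            if node_map[n] in (s, t) and arc not in images:
                return False  # this arc of G would lose an end point
    return True


def identification_ok(L, K_nodes, K_arcs, node_map, arc_map):
    """Second condition: a removed item and a kept item have distinct images."""
    removed_nodes = {node_map[n] for n in L.nodes - K_nodes}
    kept_nodes = {node_map[n] for n in K_nodes}
    removed_arcs = {arc_map[a] for a in set(L.arcs) - K_arcs}
    kept_arcs = {arc_map[a] for a in K_arcs}
    return (removed_nodes.isdisjoint(kept_nodes)
            and removed_arcs.isdisjoint(kept_arcs))
```

A match is admissible for a direct derivation only when both functions return `True`. For instance, a production that deletes a node cannot be applied at a node of $G$ that has an additional incident arc not covered by the left-hand side.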

The single-pushout approach is simpler. Graph morphisms are partial maps,
productions are simply morphisms $L \to R$, and there is no restriction on $m$.
Because of that, derivations may have unintuitive side-effects. Moreover, the
approach allows the removal of nodes in unknown contexts. We feel that for 
dynamic architecture reconfiguration it is preferable to allow the designer to 
control precisely in which situations a component (i.e., node) may be removed. 
For these reasons, we adopt the double-pushout approach. 



2 Community 

Community [11] is a program design language based on UNITY [3] and IP [13].
In this paper we only consider a subset of the full language [18]. We also assume 
a fixed algebraic data type specification [25]. We do not present the specification 
of the types and predefined functions used in this paper. 



2.1 Programs 

For our purposes, a Community program consists of a set of typed attributes,
a boolean expression to be satisfied by the initial values of the attributes, and a
set of actions, each of the form name: [guard → assignment(s)]. The empty set
of assignments is denoted by skip. At each step, one of the actions is selected
and, if its guard is true, its assignments are executed simultaneously.

To be more precise, syntactically a program has the form

prog P
write W
read R
init I
do a_1: [g_1 → w_11 := exp_11 || w_21 := exp_21 || ...]
[] a_2: [g_2 → w_12 := exp_12 || ...]
[] ...

where $R$ are external attributes (i.e., the program may not change their values),
$W$ are the local attributes, $I$ is the initialisation condition on $W$, $a_i$ are the
actions with boolean expressions $g_i$ over $W \cup R$, $w_{ij} \in W$, and $exp_{ij}$ expressions
over $W \cup R$. The values of the external attributes are given by the environment
and may change at each step.

The following program describes the behaviour of the i-th cart, as introduced 
in section 1.4. 

prog Cart_i 
write l : int 
init l = start(i) 
do move: [true → l := (l + 1) mod N] 

Henceforth we abbreviate “(A + 1) mod N” as “A +_N 1” and omit the action 
guards when they are “true”. 
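
The execution model described above can be illustrated with a small Python sketch. It is not part of Community: the names `Action` and `step`, and the values chosen for `N` and `START` (standing for N and start(i)), are assumptions made only for this illustration. The point it shows is that all assignments of an action are evaluated on the old state and applied simultaneously.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, int]


@dataclass
class Action:
    name: str
    guard: Callable[[State], bool]
    # Each assigned attribute is mapped to an expression over the current state.
    assignments: Dict[str, Callable[[State], int]]


def step(state: State, action: Action) -> State:
    """Execute one selected action: if its guard holds, apply all assignments at once."""
    if not action.guard(state):
        return state
    # Evaluate every right-hand side on the old state before updating anything.
    new_values = {attr: expr(state) for attr, expr in action.assignments.items()}
    return {**state, **new_values}


N = 10      # hypothetical track length
START = 3   # hypothetical value of start(i)

# The single action of Cart_i: move: [true -> l := (l + 1) mod N]
move = Action("move", lambda s: True, {"l": lambda s: (s["l"] + 1) % N})

state = {"l": START}
for _ in range(N):
    state = step(state, move)
```

After N moves the cart is back at its start position, since the location is incremented modulo N.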
2.2 Superposition 



A morphism from a program P to a program P' states that P is a component of 
the system P' and, as shown in [11], captures the notion of program superposition 
[3,13]. Mathematically speaking, the morphism maps each attribute of P into an 
attribute of P' of the same type, and such that local attributes of the component 
P are mapped to local attributes of the system P', and it maps each action name 
a of P into a (possibly empty) set of action names {a'_1, ..., a'_n} of P' [34]. Those 
actions correspond to the different possible behaviours of a within the system 
P'. These different behaviours usually result from synchronisations between a 
and other actions of other components of P'. Thus each action a'_i must preserve 
the functionality of a, possibly adding more things specific to other components 
of P'. In particular, the guard of a'_i must not be weaker than the guard of a, 
and the assignments of a must be contained in a'_i. 

Putting it in a succinct way, a morphism f : P → P' maps the vocabulary 
of P into the vocabulary of P', and thus any expression e over the attributes of 
P can be automatically translated into an expression f(e) over the attributes of 
P'. Thus action a: [g → w := exp] of P is mapped into a set of actions 
a'_i: [g'_i → f(w) := f(exp) || ...] of P' with each g'_i implying f(g). 

It is easy to see that the “component-of” relationship, established by the map- 
pings of attributes and actions as described, is reflexive and transitive. Therefore, 
programs and superposition morphisms constitute a category. 

Continuing the example, the following diagram shows in which way program 
“Cart_i” can be superposed with a counter that checks how often the cart passes 
by its start position. Notice how the second program strengthens the initialisa- 
tion condition and it divides action “move” in two sub-cases, each satisfying the 
condition given above. 



prog Cart_i ... 

      |  move ↦ {move, pass} 
      |  l ↦ location 
      v 

prog CartWithLaps_i 
write location, laps : int 
init location = start(i) ∧ laps = -1 
do pass: [location = start(i) → location := location +_N 1 || laps := laps + 1] 
[] move: [location ≠ start(i) → location := location +_N 1] 
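
The superposition condition on assignments can be checked exhaustively for this small example. The following Python sketch is only an illustration under stated assumptions: `N` and `START` are hypothetical values for N and start(i), the function names are invented here, and lap counts are sampled from a small range, so this is a finite test and not a proof. Since the guard of “move” in Cart_i is “true”, the guard condition holds trivially and only the assignments are checked.

```python
N = 8       # hypothetical track length
START = 2   # hypothetical value of start(i)


def cart_move(l):
    """Action move of Cart_i (guard true): l := l +_N 1."""
    return (l + 1) % N


# Actions of CartWithLaps_i over (location, laps); None means the guard is false.
def pass_(location, laps):
    if location == START:
        return (location + 1) % N, laps + 1
    return None


def move(location, laps):
    if location != START:
        return (location + 1) % N, laps
    return None


def refines_move(action):
    """Check that `action` updates location exactly as Cart_i's move updates l
    (under the attribute map l -> location), on every sampled state where it is enabled."""
    for location in range(N):
        for laps in range(-1, 3):
            result = action(location, laps)
            if result is not None and result[0] != cart_move(location):
                return False
    return True


checks = [refines_move(pass_), refines_move(move)]
```

Both entries of `checks` are true: each sub-case of “move” contains the original assignment and merely adds a guard or an extra assignment to laps.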



2.3 Configurations 

Interactions between programs are established through action synchronisation 
and memory sharing. This is achieved by relating the relevant action and at- 
tribute names of the interacting programs. 

In Category Theory, all relationships between objects must be made explicit 
through morphisms. In the particular case of COMMUNITY programs, it means 
for example that two attributes (or actions) of two unrelated programs are dif- 
ferent, even if they have the same name. To state that attribute (or action) 
a_1 of program P_1 is the same as attribute (resp. action) a_2 of P_2 one needs a 
third, “mediating” program C (the channel) containing just an attribute (resp. 
action) a and two morphisms σ_i : C → P_i that map a to a_i. 

In general, a channel contains the features that are shared between the two 
programs it is linked to, thus establishing a symmetrical and partial relationship 
between the vocabularies of those programs. To be more precise, a channel is 
just a degenerate program that provides the basic interaction mechanisms (syn- 
chronisation and memory sharing) between two given programs and thus adds 
no attributes or computations of its own. Thus a channel is always of the form 

prog P 
read R 
do []_{a ∈ A} a: [true → skip] 



We use the abbreviated notation ⟨R | A⟩. 

Even with the disciplined use of channels to establish interactions, problems 
arise if two synchronised actions update a shared attribute in distinct ways. 
As actions only change the values of local attributes, it is sufficient to impose 
that local attributes are not shared, neither directly through a single channel 
nor indirectly through a sequence of channels. This restriction forces interac- 
tions between programs to be synchronous communication of values (with local 
attributes acting as output ports and external attributes as input ports), a very 
general mode of interaction that is suitable for the modular development of 
reusable components, as needed for architectural design. 

We call diagrams that satisfy these conditions configurations. Moreover, in 
order to cater for architectures obtained through the application of connectors 
(Section 2.4) and through the replacement of components by more specialized 
ones (Section 2.5), in a configuration each program is either isolated or connected 
through n channels to exactly n other programs or a specialisation of one or more 
programs, like in this diagram: 

P_1     P_2 ← P_3 ← P_4 ← C_1 → P_5 → P_6     P_7 ← C_2 → P_8 

We take advantage of the graphical nature of diagrams to present a simple and 
intuitive formal definition that has a straightforward efficient implementation. 

First we define the data view of a diagram of programs and superposition 
morphisms as a directed graph with one node (P_i, a_ij) for each attribute a_ij of each 
program P_i occurring in the diagram, and an arc from node (P_i, a_ij) to (P_k, a_kl) 
if there is a morphism in the diagram mapping a_ij into a_kl. A diagram is 
well-formed if, in its data view, whenever there is an undirected path between two 
nodes corresponding to local attributes, there is a directed path between them. 
This represents formally the intuitive idea that two programs may “share” local 
attributes only if one is a sub-program of the other. As stated in Section 1.5, 
a diagram is a directed (labelled) graph, and thus it is possible to compute for 
each vertex the number of incoming and outgoing edges, called the indegree and 
outdegree, respectively. A configuration is then a well-formed diagram where 
each node is a channel with indegree zero and outdegree two or a program with 
outdegree zero (like P_1 and P_6 in the above configuration example) or one. 

It is efficient to check whether a given diagram is a configuration. Notice that, 
due to the restriction on the outdegree of programs, in the data view each local 
attribute has at most one outgoing arc, and thus it is easy to see whether there is 
a directed path between two of them. Moreover, the find-union algorithm [30] can 
compute efficiently the connectivity of attributes, since the actual non-directed 
paths are irrelevant. 
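
The well-formedness check on the data view can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the function names and the encoding of nodes as (program, attribute) pairs are assumptions, and “a directed path between them” is read here as a directed path in either direction. Union-find groups attributes that are connected by an undirected path; a simple search then looks for a directed path.

```python
from itertools import combinations


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def reachable(succ, src, dst):
    """True if dst can be reached from src by following arcs forwards."""
    seen, stack = set(), [src]
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(succ.get(node, ()))
    return False


def is_well_formed(nodes, arcs, local):
    """nodes: all (program, attribute) pairs of the data view;
    arcs: list of (source, target) pairs induced by the morphisms;
    local: the subset of nodes that are local attributes."""
    parent = {n: n for n in nodes}
    succ = {}
    for a, b in arcs:
        parent[find(parent, a)] = find(parent, b)
        succ.setdefault(a, []).append(b)
    for u, v in combinations(sorted(local), 2):
        connected = find(parent, u) == find(parent, v)
        if connected and not (reachable(succ, u, v) or reachable(succ, v, u)):
            return False
    return True


# Two local attributes shared through a channel: not well-formed.
shared = [("C", "x"), ("P1", "a"), ("P2", "b")]
shared_arcs = [(("C", "x"), ("P1", "a")), (("C", "x"), ("P2", "b"))]
bad = is_well_formed(shared, shared_arcs, {("P1", "a"), ("P2", "b")})

# A local attribute mapped to a local attribute of a specialisation: well-formed.
spec = [("Cart", "l"), ("CartWithLaps", "location")]
good = is_well_formed(spec, [(spec[0], spec[1])], set(spec))
```

The pairwise loop is quadratic in the number of local attributes; it is kept for clarity and does not exploit the outdegree restriction mentioned in the text.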

It can be proved that every finite configuration has a colimit, which, by defi- 
nition, returns the minimal program that contains all programs in the diagram. 
Since the proof is constructive, the configuration can be “compiled” into a single 
program that simulates the execution of the overall system. More precisely, the 
colimit is obtained by taking the disjoint union of the attributes (modulo shared 
attributes), the disjoint union of actions (modulo synchronized ones), and the 
conjunction of the initialisation conditions. Actions are synchronized by taking 
the conjunction of the guards and the parallel composition of assignments. An 
example is provided in the next section. 
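
The synchronisation of two actions can be sketched in Python as follows. The encoding of an action as a (guard, assignments) pair, the function names, and the value of `N` are assumptions made for illustration; the attribute names fl and nl anticipate the cart example of the next section. The sketch relies on the restriction stated above that local attributes are not shared, so the two assignment sets are disjoint.

```python
def synchronise(a, b):
    """Combine two actions: conjunction of guards, parallel composition of assignments."""
    guard_a, assigns_a = a
    guard_b, assigns_b = b
    if set(assigns_a) & set(assigns_b):
        raise ValueError("synchronised actions must not assign the same attribute")

    def guard(state):
        return guard_a(state) and guard_b(state)

    return guard, {**assigns_a, **assigns_b}


def run(action, state):
    """Execute an action if enabled; right-hand sides are evaluated on the old state."""
    guard, assigns = action
    if not guard(state):
        return state
    return {**state, **{w: e(state) for w, e in assigns.items()}}


N = 10  # hypothetical track length

far_move = (lambda s: True, {"fl": lambda s: (s["fl"] + 1) % N})
near_move = (lambda s: True, {"nl": lambda s: (s["nl"] + 1) % N})

ab = synchronise(far_move, near_move)
result = run(ab, {"fl": 9, "nl": 4})
```

Here `result` holds fl = 0 and nl = 5: both carts advance in the same step.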



2.4 Architectures 

An n-ary connector consists of n roles R_i and one glue G stating the interaction 
between the roles. These act as “formal parameters”, restricting which compo- 
nents may be linked together through the connector. Thus, the roles may contain 
attributes and actions which are not used for the interaction specification. Like- 
wise, the glue may contain attributes and actions that are not visible to the roles. 
Hence, glue and role only share part of the vocabulary. Applying the notion of 
channel to connectors [8], for each role R_i there must be a channel C_i together 
with morphisms γ_i : C_i → G and ρ_i : C_i → R_i stating which attributes and 
actions of Ri are used in the interaction specification, i.e., the glue. A connector 
can be seen as an extension of a channel, namely one for more complex interac- 
tions that require additional computations. Therefore a connector is represented 
by a very specific kind of configuration. 

Returning to our example, assume that two carts are approaching the crossing 
and one of them is nearer to it. To avoid a collision it is sufficient to force the 
nearest cart to move whenever the most distant one does. That can be achieved 
using an action subsumption connector. Action a subsumes action b if b executes 
whenever a does. This can be seen as a partial synchronisation mechanism: a is 
synchronised with b, but b can still execute freely. The connector that establishes 
this form of interaction is given by glue “G” and roles “Far” and “Near” of Figure 1. 

Notice that although the two roles are isomorphic, the binary connector is 
not symmetric because the glue treats the two actions differently. This is clearly 
indicated in the glue: “b” may be executed alone at any time, while “a” must 
co-occur with “b” if the interaction is taking place. Hence, action “a” is the one 
that we want to connect to the “move” action of the cart that is further away 
  channel ⟨ | a⟩:   a ↦ move (into Far),    a ↦ ab (into G) 
  channel ⟨ | b⟩:   b ↦ {ab, b} (into G),   b ↦ move (into Near) 

  prog Far 
  write fl : int 
  do move: [fl := fl +_N 1] 

  prog G 
  do ab: [skip] 
  [] b: [skip] 

  prog Near 
  write nl : int 
  do move: [nl := nl +_N 1] 

  Far → Cart_3 ...:    move ↦ move,  fl ↦ l 
  Near → Cart_5 ...:   move ↦ move,  nl ↦ l 

  Colimit (with l of Cart_3 mapped to fl and move of Cart_5 mapped to {ab, b}): 

  prog Carts 
  write fl, nl : int 
  init fl = start(3) ∧ nl = start(5) 
  do ab: [fl := fl +_N 1 || nl := nl +_N 1] 
  [] b: [nl := nl +_N 1] 

Fig. 1. An applied action subsumption connector and its colimit 



from the crossing, while action “b” is associated to the movement of the nearest 
cart (the one that will instantiate role “Near”). 

The categorical framework also allows one to make precise when an n-ary 
connector can be applied to components P_1, ..., P_n, namely when morphisms 
ι_i : R_i → P_i exist. This corresponds to the intuition that the “actual arguments” 
(i.e., the components) must instantiate the “formal parameters” (i.e., the roles). 

An architecture is then a configuration where all components (i.e., programs) 
interact through connectors, and all roles are instantiated. It follows that any 
architecture has a semantics given by its colimit. 

Proceeding with our action subsumption connector, the roles omit the ini- 
tialisation condition of the location attribute, so that they can be instantiated 
with any particular cart. Figure 1 shows a simple architecture (consisting only 
of the application of the connector to carts 3 and 5, assuming the latter is nearer 
to the crossing) and the resulting colimit. 



2.5 Reconfiguration 

Since a diagram in a given category is a graph with nodes labelled by objects 
of that category and arcs labelled with morphisms, the algebraic graph trans- 
formation approach can be directly applied to architecture reconfiguration. A 
reconfiguration rule is simply a graph production where the left-hand side, the 
interface, and the right-hand side are architectures. A reconfiguration step is a 
direct derivation from a given architecture G to an architecture H. In the al- 
gebraic graph transformation approach, there is no restriction on the obtained 
graphs, but in reconfiguration we must check that the result is indeed an archi- 
tecture, otherwise the rule (with the given match) is not applicable. For example, 
two separate connector addition rules may each be correct but applying them 
together may yield a non-architecture. 
Returning to our example, if we want to add a counter to a cart, no matter 
which connectors it is currently linked to, we just superpose the CartWithLaps 
program on it, with ι the morphism shown in Section 2.2: 

Cart_i • 1   ←   Cart_i • 1   →   ( Cart_i • 1  −ι→  CartWithLaps_i • 2 ) 



3 Community with State 

It is fairly easy to extend all the previous definitions in order to take program 
state into account, thus permitting the specification of dynamic reconfiguration. 
We first add a fixed set V of typed variables to the algebraic data type 
specification. The language of terms Terms is then defined as usual from the variables 
V and the function symbols of the data type specification. In the remainder we 
assume V = {x, y : int}. 

3.1 Program Instances 

A program instance is defined as a program together with a valuation function 
ε : W → Terms that assigns to each local attribute a term of the same sort. 
Two explanations are in order. 

First, the valuation may return an arbitrary term, not just a ground term. 
Although in the running system the value of each attribute is given by a ground 
term, we need variables to be able to write rules whose left-hand sides match 
against components with possibly infinite distinct combinations of values for 
their attributes. 

The second point worth noticing is that terms are not assigned to external 
attributes. This contrasts with our previous approach [12] and there are sev- 
eral reasons. The pragmatic one is that an external attribute is often a local 
attribute in some other component, and therefore there is no need to duplicate 
the specification of its value. The conceptual reason is that in this way we only 
represent the state that is under direct control of the component. The absence 
of the external attributes’ values makes clear that they may change at any mo- 
ment. There is also a technical reason. Reconfigurations change the connectors 
between components. This entails that an external attribute may become shared 
with a different local attribute from a different component. If external attributes 
had explicitly represented state, the reconfiguration rule would have to 
change their state. However, that is not possible since graph morphisms do not 
change the labelling (given by the program instances). 

A consequence of this choice is that if the application of a rule depends on 
the value of an external attribute, then the left-hand side must also include the 
component containing the local attribute that provides the value. This may seem 
a drawback, requiring the user to write a more complex rule than necessary, but 
it is actually a benefit, since it forces the rule to be applied only in well-behaved 
states of the architecture, namely when the value of the attribute is under control 
of the system, not of the environment. 
We represent program instances in tabular form. For example, assuming that 
“abs” returns the absolute value of an integer, 

    CartWithLaps_i 
    location    start(i) − 1 
    laps        abs(x) + 1 



represents a cart that has completed at least one lap and will complete another 
one in the next step. 



3.2 Instance Morphisms 



A morphism between program instances is simply a superposition morphism 
that preserves state. To be more precise, given a superposition morphism σ : 
P → P', the algebraic data type axioms must entail ε(w) = ε'(σ(w)) for any 
local attribute w of P and any variable substitution. Hence the two terms must 
have the same variables, and therefore the relationships between the attributes 
of P are maintained by P'. 

The example of Section 2.2 using the above program instance could become 



    Cart_i 
    l           −2 + start(i) + 1 

      |  l ↦ location 
      |  move ↦ {move, pass} 
      v 

    CartWithLaps_i 
    location    start(i) − 1 
    laps        abs(x) + 1 



3.3 Architecture Instances 

It is obvious that program instances and their morphisms form a category since 
the equality of terms is reflexive and transitive. Moreover, every diagram can 
be transformed into a diagram in the category of programs and superposition 
morphisms just by omitting the valuation function. Thus the definitions of con- 
figuration instance, connector instance, and architecture instance are trivial ex- 
tensions of those presented in Sections 2.3 and 2.4. This means that an architec- 
ture instance must be well-formed and hence two different local attributes have 
no conflicting valuations. Therefore every architecture instance has a colimit, 
given by the colimit of the underlying architecture together with the union of 
the valuations of the architecture instance. 

3.4 Dynamic Reconfiguration 

Dynamic reconfiguration basically is a rewriting process over graphs labelled 
with program instances (i.e., architecture instances) instead of just programs. 
This ensures that the state of components and connectors that are not affected by 
a rule does not change, because labels are preserved, thus keeping reconfiguration 
and computation separate. However, we must guarantee that new components 
are added in a precisely determined state in order to be able to perform com- 
putations right away. For that purpose, we require that the variables occurring 
on the right-hand side of a rule also occur on the left-hand side. Furthermore, 
dynamic reconfiguration rules depend on the current state. Thus they must be 
conditional rewrite rules L ← K → R if B (with graph morphisms l : K → L and 
r : K → R), with B a proposition over the 
variables of L. 

Within the algebraic graph transformation framework it is possible to define 
conditional rules in a more uniform way, using only graphs and graph morphisms 
[14]. However, for our representation of components it is simpler, both from the 
practical and formal point of view, to represent conditions as boolean expressions 
over state variables. 



Returning to our example, to avoid collisions we give a rule that applies the 
action subsumption connector to two carts that are less than 3 units away from 
the crossing: 



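[Rule diagram. Left-hand side and interface: two cart program instances, one 
with l = x and one with l = y. Right-hand side: the same two carts joined by the 
action subsumption connector of Figure 1 (channels ⟨ | a⟩ and ⟨ | b⟩, glue G, 
roles Far and Near), with Near (nl = x) instantiated by the cart with l = x and 
Far (fl = y) by the cart with l = y, via fl ↦ l, nl ↦ l and move ↦ move.] 

if 0 < 7 − x < 28 − y < 3 ∨ 0 < 28 − x < 7 − y < 3 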



where the graph morphisms l and r are obvious. The opposite rule (with the 
negated condition) is also necessary to remove the connector when it is no longer 
needed. 

The definition of reconfiguration step must be changed accordingly. At any 
point in time the current system is given by an architecture instance whose valu- 
ations return ground terms. Therefore the notion of matching must also involve 
a compatible substitution of the variables occurring in the rule by ground terms. 
Applying the substitution to the whole rule, we obtain a rule without variables 
that can be directly applied to the current architecture using the normal def- 
inition of derivation (i.e., as a double pushout over labelled graphs). However, 
the notion of state introduces two constraints. First, the substitution must ob- 
viously satisfy the application condition. Second, the derivation must make sure 
that the state of each program instance added by the right-hand side satisfies 
the respective initialisation condition. The complete definition is as follows. 

Given a rule L ← K → R if B (with graph morphisms l and r), an architecture 
instance G, and a substitution φ from variables of L into ground terms, a 
reconfiguration step obtains H as a direct derivation from G using rule 
φ(L) ← φ(K) → φ(R) with a match m, if φ(B) is true and if for every program 
instance (P, ε) ∈ R − r(K), φ(ε(I)) is true. 
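
The two state-related side conditions of a reconfiguration step can be sketched as follows. This Python fragment is only an illustration: it does not perform the double pushout itself, and the function name, the encoding of valuations as functions of the substitution, and the value of `START` (standing for start(i)) are assumptions. The example instance is a CartWithLaps program added with location = x and laps = −1.

```python
def applicable(condition, added_instances, substitution):
    """Check the state-related side conditions of a reconfiguration step.

    condition: predicate over the substitution (the rule's condition B);
    added_instances: (valuation, init) pairs for the program instances that the
        right-hand side adds; valuation maps each local attribute to a function
        of the substitution, init is a predicate over the resulting values;
    substitution: dict from rule variables to ground values.
    """
    if not condition(substitution):
        return False
    for valuation, init in added_instances:
        values = {attr: term(substitution) for attr, term in valuation.items()}
        if not init(values):
            return False
    return True


START = 2  # hypothetical value of start(i)

# Hypothetical added instance: CartWithLaps_i with location = x and laps = -1,
# whose initialisation condition is location = start(i) and laps = -1.
cart_with_laps = (
    {"location": lambda s: s["x"], "laps": lambda s: -1},
    lambda v: v["location"] == START and v["laps"] == -1,
)

at_start = applicable(lambda s: True, [cart_with_laps], {"x": START})
elsewhere = applicable(lambda s: True, [cart_with_laps], {"x": START + 3})
```

Under these assumptions the instance can be added only when x equals start(i); for any other value of x the initialisation condition of the new program instance fails.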
As in the example above, where the added subsumption connector has no 
initialisation conditions in the glue and roles, it is often possible to prove that a 
rule will introduce program instances in a valid initial state for any substitution 
φ. Thus the run-time check for each reconfiguration step becomes unnecessary, 
leading to a more efficient implementation. 

As a final example of a dynamic reconfiguration, we present an alternative 
way of adding a counter to a cart. The following rule replaces a Cart program 
by a CartWithLaps program (instead of superposing it, as in Section 2.5). 



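[Rule diagram. Left-hand side: the program instance Cart_i with l = x. 
Right-hand side: the program instance CartWithLaps_i with location = x and 
laps = −1.] 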



Notice that the double-pushout approach guarantees that a cart is replaced 
only when it is not connected to any other component. This is important both for 
conceptual reasons (components are not removed during interactions [16]) and for 
technical ones: there will be no “dangling” roles. 

Notice also that the rule allows the description of the transfer of state from 
the old to the new component. In this case it is just a copy of value x, but in 
general, the right-hand side may contain arbitrarily complex terms that calculate 
the new values from the old ones. 

The drawback of this replacement rule is that connectors would be applied 
directly to CartWithLaps programs, and not just via Cart programs. Thus a new 
version of the subsumption connector addition rule (with node labels “CartWith- 
Laps” instead of “Cart”) would be necessary. Hence, for our example the rule 
of Section 2.5 is preferable, but in cases where there is no morphism between 
the replaced and the replacing component, i.e., the new component is not a 
superposition of the old one, a substitution rule is the only solution. 



3.5 Coordination 

An architecture instance is not just a labelled graph, it is a diagram with a 
precise semantics, given by its colimit. Formally, we can define a computation 
step of the system as being performed on the colimit and then propagated back 
to the components of the architecture through the inverse of their morphisms to 
the colimit. This keeps the state of the program instances in the architectural 
diagram consistent with the state of the colimit, and ensures that at each point 
in time the correct conditional rules are applied. As in [21,15], we adopt a 
two-phase approach: each computation step is followed by a dynamic reconfiguration 
sequence. In this way, the specification of the components is simpler, because 
it is guaranteed that the necessary interconnections are in place as soon as 
required by the state of the components. In our example, a cart simply moves 
forward without any concern for its location. Without the guarantee that an action 
subsumption connector will exist whenever necessary, a cart would have to be 
aware of its surroundings to be sure it would not collide with another cart. And 
this would make the program much more complex. 
When reconfigurations do not occur frequently, it is worth computing the colimit 
explicitly. For COMMUNITY programs, it suffices to compute (disjoint) 
unions of sets (of attribute names, of assignments, and of propositions for the 
guards and the initialisation condition) and efficient methods exist [37]. However, 
computation steps can be performed directly on the architecture instance. Sim- 
ply choose an action of some component and check its guard. If it is false, the 
computation step terminates. Otherwise follow the morphisms to find all the 
actions that are synchronised with it, checking each guard before finding the next 
action. If all found actions have true guards, their assignments are executed. As 
the evaluation of guards (and assignments) may depend on external variables, 
it is useful to keep the equivalence classes of attributes (computed by the find- 
union algorithm over the data view). In that way, for each external attribute 
it is efficient to see which local attribute holds the value. A further efficiency 
gain is obtained by computing the data view in parallel with the reconfiguration 
process, since each reconfiguration rule induces a rewrite rule on the data view 
graph. 
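
The lookup of the local attribute that holds the value of an external attribute can be sketched as follows. This is a minimal Python illustration under stated assumptions: the class and function names and the encoding of attributes as (program, attribute) pairs are invented here. When a class contains several local attributes (a program and its specialisation), any of them is returned, since instance morphisms keep their values equal.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)


def value_holders(arcs, local):
    """Map every attribute of the data view to a local attribute of its
    equivalence class, or to None if the class has no local attribute."""
    uf = UnionFind()
    for a, b in arcs:
        uf.union(a, b)
    holder = {}
    for attr in sorted(local):
        holder[uf.find(attr)] = attr
    return {attr: holder.get(uf.find(attr)) for attr in list(uf.parent)}


# A channel attribute x shared by the output of P1 and an input of P2.
arcs = [(("C", "x"), ("P1", "out")), (("C", "x"), ("P2", "in"))]
holders = value_holders(arcs, {("P1", "out")})
```

With this table, evaluating a guard that mentions the external attribute ("P2", "in") only requires reading the value of ("P1", "out").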



4 Concluding Remarks 



This paper presents an algebraic foundation for a relevant software engineering 
practice: the reconfiguration of software architectures. All formalisms strive for 
expressivity, conceptual elegance (including simplicity and uniformity), suitability, 
and implementability. We think that our approach strikes a better balance 
than previous work [15,24,1,2,12,33]. 

Through the use of Category Theory, software architectures and their 
reconfiguration are both represented in a graphical yet mathematically rigorous way 
at the same level of abstraction, resulting in a very uniform framework. Fur- 
thermore, computations and reconfigurations are kept separate but related in an 
explicit, simple, and direct way through the colimit construction. 

The used program design language is at a higher level of abstraction than 
process calculi or term rewriting, allowing a more intuitive representation of 
program state and computations. Although illustrated with COMMUNITY [18], 
the presented framework is applicable to any language with the usual notion of 
state that has a categorical semantics. In fact, all the definitions (dynamic re- 
configuration rule, program instance, etc.) build just on the language-dependent 
notions of program, superposition, channel, and data view. 

The approach is suitable to formalise several common problems: transferring 
the state during replacement, removing components in a quiescent state [16], 
adding components properly initialized. 

This paper handled only the specification aspects. In future work we will 
adapt the temporal logic developed for COMMUNITY [9] in order to reason 
about the reconfiguration process. Afterwards we will implement the framework, 
either using a rewriting logic implementation [4], or by integrating a library [37] 
into a Community tool that is being developed. 
Abstract. Consistency is a major issue that must be properly addressed 
when considering multiple view architectures. In this paper, we provide 
a formal definition of views expressed graphically using diagrams with 
multiplicities and propose a simple algorithm to check the consistency of 
diagrams. We also put forward a simple language of constraints to ex- 
press more precise (intra-view and inter-view) consistency requirements. 
We sketch a complete decision procedure to decide whether diagrams 
satisfy a given constraint expressed in this language. Our framework is 
illustrated with excerpts of a case study: the specification of the archi- 
tecture of a train control system. 



1 Multiple Views: Significance and Problems 

Because software must satisfy a variety of requirements of different natures, 
most development methods or notations include a notion of software view. For 
example, RM-ODP descriptions [2] include five viewpoints (enterprise, information, 
computational, engineering, and technology); UML [15] is defined in terms 
of a collection of diagrams such as static structure, statechart, component and 
deployment diagrams. However, these methods do not have a mathematical ba- 
sis. As a consequence, some of their notations may be interpreted in different 
ways by different people. Another drawback of this lack of formalization is the 
fact that it is impossible to reason about the consistency of the views. As we 
will see later, it is quite easy to define views that are in fact inconsistent (either 
internally or because of contradictory constraints on their relationships). 

The relevance of the notion of view for software architectures has already been 
advocated in [12] but, to the best of our knowledge, no existing formally based 
architecture description language really supports the notion of multiple views. 
The key technical issue raised by multiple view architectures is consistency. In 
order to be able to reason about consistency, it is necessary to provide a formal 
definition of the views and their relationships. This is precisely the objective 
of this paper, with the additional challenge to design an automatic method for 
multiple view consistency checking. 

We see a software architecture as a specification of the global organization of 
software involving components and connections between them. Components and 
connections are associated with attributes whose nature depends on the property 
of interest. Most papers published in the area of software architectures have 
focused on attributes related to communication and synchronization [1,11,4]. In 
this paper, we focus on structural properties and we consider only simple, type- 
like attributes. This context is sufficient both to express interesting properties 
and to raise non-trivial consistency issues that must be properly addressed before 
considering more sophisticated attributes. 

In order to avoid introducing a new language, we start in Section 2 from 
diagrams, a graphical representation which is reminiscent of notations such as 
UML. We justify the use of diagrams in the context of software architectures and 
define their semantics as sets of graphs. This notation does not prevent us from 
defining (either internally or mutually) inconsistent views. A simple algorithm is 
proposed to check the consistency of diagrams. Section 3 illustrates the frame- 
work with a brief account of a case study conducted in collaboration with the 
Signaal company. The goal of the study was the specification of the architecture 
of a train control system involving a variety of functional and non functional 
requirements. We provide excerpts of some of the views that turned out to be 
relevant in this context. Diagrams themselves express in a natural way structural 
constraints on the views but it is desirable to be able to express more sophis- 
ticated constraints on views. In section 4, we put forward a simple language 
of constraints to express more precise (intra-view and inter-view) consistency 
requirements. We sketch a complete decision procedure to decide whether dia- 
grams satisfy the constraints expressed in this language. Section 5 compares our 
approach with related work and suggests some avenues for further research. 

2 Diagrams: Notation, Semantics and Consistency 

In this section, we provide a formal definition of diagrams and study their con- 
sistency. Their use to describe software architecture views is illustrated in the 
next section. 

2.1 Graphical Notation 

As mentioned in [1], software architectures have been used by developers for a 
long time, but in a very informal way, just as “box and line drawings” . Typically, 
such drawings cannot be used to detect potential inconsistencies of the archi- 
tecture in a systematic way or to enforce the conformance of the constructed 
software with respect to the architecture. It is the case however that some con- 
ventions have emerged for the graphical representation of software views in the 
area of development methods. These graphical notations were not proposed orig- 
inally to describe software architectures in the accepted sense of the word (e.g. 
they do not include provision for defining the behavior of connectors or compo- 
nent interfaces) . However, they are rich enough to be considered as a good basis 
for the specification of the overall organization of a software. So, rather than 
crafting a language of our own, we decided to use a graphical notation already 
familiar to many developers. A diagram is a collection of nodes and edges with 
multiplicities, in the spirit of UML class diagrams. Figure 1 provides an example 
of a diagram. 




Fig. 1. A diagram 



Nodes ($A, B, C, D$ in Figure 1) are connected via directed edges. An edge
bears a type ($\alpha, \beta, \gamma, \delta$ in Figure 1) and an interval over the natural numbers
(called a multiplicity) at each end. The intervals $[i,j]$, $[i,i]$, $[i,\infty[$ and $[0,\infty[$ are
noted $i..j$, $i$, $i..*$, and $*$ respectively.

Such a diagram represents in fact a class of graphs (called instance graphs).
Each node and edge in a diagram may represent several nodes and edges in an
instance graph. The role of multiplicities is to impose constraints on the number
of connected instance nodes. More precisely, an edge $A \xrightarrow{I\ \alpha\ J} B$ specifies that
each instance $a$ of $A$ is connected via outgoing $\alpha$-edges to $j_a$ ($j_a \in J$) instances
of $B$, whereas each instance $b$ of $B$ is connected via incoming $\alpha$-edges to $i_b$
($i_b \in I$) instances of $A$. For example, the graphs of Figure 2 are valid instances
of the diagram of Figure 1:




Fig. 2. Instance graphs 



2.2 Semantics 

In order to be able to reason about diagrams, we provide a formalization of
the intuitive definition suggested above. Formally, a diagram is represented by
a structure $\langle N_g, E_g, m_s, m_d \rangle$ called a generic graph where:

— $N_g \subseteq \mathcal{N}$ is a finite set of typed nodes¹ ($\mathcal{N}$ denotes the domain of nodes).

— $E_g \subseteq \mathcal{N} \times T_e \times \mathcal{N}$ is a finite set of typed edges (the set $T_e$ denotes the domain
of edge types). If $(A,\alpha,B) \in E_g$ then $A \in N_g$ and $B \in N_g$; $A$ is called the
source node and $B$ the destination node.

— $m_s, m_d : \mathcal{N} \times T_e \times \mathcal{N} \to \mathcal{I} - \{[0,0]\}$ are functions mapping each edge to the
multiplicities associated with the source and destination nodes respectively.
$\mathcal{I}$ is the set of (non empty) intervals over Nat. Like UML, we disallow null
multiplicities ($[0,0]$).

The instances of a generic graph are graphs $\langle N_i, E_i \rangle$ where $N_i \subseteq \mathcal{N}$ is the
set of instance nodes and $E_i \subseteq \mathcal{N} \times T_e \times \mathcal{N}$ is the set of instance edges.



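$Sem(G_g) = \{ G_i \mid \exists\, Class : N_i \to N_g,\ G_g \overset{Class}{\leadsto} G_i \}$

where $\langle N_g, E_g, m_s, m_d \rangle \overset{Class}{\leadsto} \langle N_i, E_i \rangle$ iff:

$\forall A \in N_g.\ \exists a \in N_i.\ Class(a) = A$   (1)

$\forall E = (A,\alpha,B) \in E_g.\ \forall a \in N_i.\ Class(a) = A \Rightarrow Card\{b \mid Class(b) = B \wedge (a,\alpha,b) \in E_i\} \in m_d(E)$   (2)

$\forall E = (A,\alpha,B) \in E_g.\ \forall b \in N_i.\ Class(b) = B \Rightarrow Card\{a \mid Class(a) = A \wedge (a,\alpha,b) \in E_i\} \in m_s(E)$   (3)

$\forall a, b \in N_i.\ Class(a) = A \wedge Class(b) = B \wedge (A,\alpha,B) \notin E_g \Rightarrow (a,\alpha,b) \notin E_i$   (4)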



Fig. 3. Semantics of generic graphs 



Figure 3 describes the semantics of a generic graph as the set of instance 
graphs respecting the multiplicities and type constraints. A graph Gi is a valid 
instance of Gg if and only if there exists a total function Class mapping instance 
nodes to their generic node such that four conditions hold. The first condition 
enforces that each generic node has at least one instance. The second and third
conditions enforce the multiplicity constraints as described before. The last condition
expresses the fact that all the typed edges allowed in instance graphs are those 
described in the generic graph. 
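
The four conditions of Figure 3 can be checked mechanically once a candidate Class function is fixed. The following is a minimal sketch under stated assumptions: the function name `is_valid_instance`, the tuple encoding of graphs, and the small example diagram are all hypothetical and not taken from the paper. It tests one given Class function only, whereas the semantics asks whether some Class function exists.

```python
def is_valid_instance(generic, instance, cls):
    """Check conditions (1)-(4) of Figure 3 for one given Class function `cls`.

    generic  = (nodes_g, edges_g, ms, md); ms/md map an edge to (low, high)
    instance = (nodes_i, edges_i); edges are (source, type, destination)
    """
    nodes_g, edges_g, ms, md = generic
    nodes_i, edges_i = instance

    def in_interval(count, interval):
        low, high = interval  # use float("inf") as `high` for an unbounded interval
        return low <= count <= high

    # (1) every generic node has at least one instance
    if not all(any(cls[a] == A for a in nodes_i) for A in nodes_g):
        return False
    for edge in edges_g:
        A, alpha, B = edge
        # (2) each instance of A reaches an admissible number of instances of B
        for a in nodes_i:
            if cls[a] == A:
                targets = [b for b in nodes_i
                           if cls[b] == B and (a, alpha, b) in edges_i]
                if not in_interval(len(targets), md[edge]):
                    return False
        # (3) each instance of B is reached by an admissible number of instances of A
        for b in nodes_i:
            if cls[b] == B:
                sources = [a for a in nodes_i
                           if cls[a] == A and (a, alpha, b) in edges_i]
                if not in_interval(len(sources), ms[edge]):
                    return False
    # (4) every instance edge is an instance of some generic edge
    return all((cls[a], alpha, cls[b]) in edges_g for (a, alpha, b) in edges_i)


# Hypothetical diagram: one alpha-edge from A to B, multiplicity 1 at the A end
# and 1..2 at the B end.
EDGE = ("A", "alpha", "B")
GENERIC = ({"A", "B"}, {EDGE}, {EDGE: (1, 1)}, {EDGE: (1, 2)})
CLS = {"a1": "A", "b1": "B", "b2": "B"}
GOOD = ({"a1", "b1", "b2"}, {("a1", "alpha", "b1"), ("a1", "alpha", "b2")})
BAD = ({"a1", "b1", "b2"}, {("a1", "alpha", "b1")})  # b2 has no incoming alpha-edge
```

In this example `GOOD` satisfies all four conditions, while `BAD` violates condition (3) because `b2` has no incoming edge.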

2.3 Consistency Checking 

It is important to realize that a diagram can represent an empty set of graphs. 
In this case, we say that the diagram is inconsistent. 

Definition 1  $Consistent(G_g) \equiv Sem(G_g) \neq \emptyset$

For example, the diagram made of the two edges $A \xrightarrow{2\ \alpha\ 1} B$ and
$A \xrightarrow{1\ \beta\ 1} B$ is inconsistent. The reason lies
in the contradiction in the specification of multiplicities: the multiplicities of
the $\alpha$-edge imply that there must be two instances of $A$ for each instance of
$B$ whereas the multiplicities of the $\beta$-edge imply that there must be a single
instance of $A$ for each instance of $B$.

¹ Node types do not play any role in this section. They are used to distinguish different
kinds of entities in diagrams defining software architecture views (Section 3).

In this case, inconsistency may look obvious but it is not always easy to 
detect contradictions in more complex diagrams. Fortunately, consistency can 
be reduced to the satisfiability of a system of linear inequalities. This system is 
derived from the generic graph, using the formula described in Figure 4. 



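$Sat(\langle N_g, E_g, m_s, m_d \rangle) \equiv \exists \{x_N \geq 1\}_{N \in N_g}$ such that

$\bigwedge_{E = (A,\alpha,B) \in E_g} \left( x_A \geq i_A \ \wedge\ x_B \geq i_B \ \wedge\ x_A\, j_B \geq x_B\, i_A \ \wedge\ x_B\, j_A \geq x_A\, i_B \right)$

where $[i_A, j_A] = m_s(E)$ and $[i_B, j_B] = m_d(E)$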



Fig. 4. Consistency as integer constraint solving 



In Figure 4, a variable $x_N$ represents the number of instances of a generic
node $N$. The semantics enforces that each node has at least one instance so each
variable $x_N$ must satisfy the constraint $x_N \geq 1$. Furthermore, for each generic
edge $A \xrightarrow{[i_A,j_A]\ \alpha\ [i_B,j_B]} B$,
four constraints between the number of instances and the multiplicities are
produced. These constraints will be justified in the proof below. For a generic graph
with $n$ nodes and $e$ edges this produces a system of $4e+n$ linear inequalities over
$n$ variables. Standard and efficient techniques can be applied to decide whether
such a system has a solution.



We now apply the consistency check to the simple diagram made of the edges
$A \xrightarrow{2\ \alpha\ 1} B$ and $A \xrightarrow{1\ \beta\ 1} B$
that was declared inconsistent at the beginning of this section.
The $\alpha$-edge raises the constraints $x_A \geq 2 x_B$ and $x_A \leq 2 x_B$, which are
equivalent to $x_A = 2 x_B$. The $\beta$-edge raises the constraints $x_A \geq x_B$ and
$x_A \leq x_B$, which impose $x_A = x_B$. Thus, the system of linear inequalities derived
from the diagram requires that $x_B = 2 x_B$. Together with the constraints $x_A \geq 1$
and $x_B \geq 1$, this equation has no solution. Hence, Sat returns false and we can
conclude that the diagram is inconsistent.
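
The derivation of the inequalities of Figure 4 and the worked example can be sketched as follows. This is an illustration under stated assumptions, not the decision procedure of the paper: the names `satisfies` and `find_solution` are hypothetical, and the exhaustive search up to a fixed `bound` stands in for a proper integer linear constraint solver, so a `None` result is only conclusive when no solution exists for any bound (as in the example).

```python
from itertools import product


def satisfies(x, edges):
    """Check the constraints of Figure 4 for an assignment x: node -> instance count.

    Each edge is (A, alpha, B, (iA, jA), (iB, jB)) with (iA, jA) = ms(E) and
    (iB, jB) = md(E); use float("inf") for an unbounded upper limit.
    """
    if any(count < 1 for count in x.values()):
        return False
    for (A, _alpha, B, (i_a, j_a), (i_b, j_b)) in edges:
        if x[A] < i_a or x[B] < i_b:
            return False
        if x[A] * j_b < x[B] * i_a or x[B] * j_a < x[A] * i_b:
            return False
    return True


def find_solution(nodes, edges, bound=20):
    """Search for a solution with every variable in 1..bound (bounded sketch)."""
    names = sorted(nodes)
    for values in product(range(1, bound + 1), repeat=len(names)):
        x = dict(zip(names, values))
        if satisfies(x, edges):
            return x
    return None


# The inconsistent diagram of this section: multiplicities 2 and 1 on the
# alpha-edge, 1 and 1 on the beta-edge.
ALPHA = ("A", "alpha", "B", (2, 2), (1, 1))
BETA = ("A", "beta", "B", (1, 1), (1, 1))
```

With both edges, `find_solution({"A", "B"}, [ALPHA, BETA])` finds nothing; with the alpha-edge alone the search returns two instances of A for one instance of B.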

It remains to prove that $Sat(G_g)$ provides a necessary and sufficient condition
for the consistency of $G_g$.

Property 1  $Consistent(G_g) \iff Sat(G_g)$

Proof. 

($\Rightarrow$) If $Sem(G_g) \neq \emptyset$, there exists an instance graph $G_i$ and a function Class
such that conditions (1), (2), and (3) of Figure 3 are satisfied. For each node $A$
of $G_g$ we note $X_A$ the number of instances of $A$ occurring in $G_i$ and show that
the $X_A$'s form a solution to the system of constraints.

• Condition (1) implies that $X_A \geq 1$ for each $A \in N_g$.

• For each edge $A \xrightarrow{[i_A,j_A]\ \alpha\ [i_B,j_B]} B$, condition (2) implies that $X_B \geq i_B$
and condition (3) that $X_A \geq i_A$.

• Condition (2) (resp. (3)) implies that each instance of $A$ (resp. $B$) is connected
to at least $i_B$ (resp. $i_A$) and at most $j_B$ (resp. $j_A$) instances of $B$ (resp. $A$).
Let $E^{\alpha}_{AB}$ be the total number of $\alpha$-edges between instances of $A$ and $B$; we
have $X_A\, i_B \leq E^{\alpha}_{AB} \leq X_A\, j_B$ and $X_B\, i_A \leq E^{\alpha}_{AB} \leq X_B\, j_A$ and therefore
$X_A\, j_B \geq X_B\, i_A$ and $X_B\, j_A \geq X_A\, i_B$.

($\Leftarrow$) We assume that $Sat(G_g)$ holds and construct an instance graph $G_i =
\langle N_i, E_i \rangle$ such that $G_i \in Sem(G_g)$.

From $Sat(G_g)$ we can associate with each node $A$ in $N_g$ a number of instances
$X_A$ respecting the whole set of constraints. We take a set of nodes $N_i$ and a
total function $Class : N_i \to N_g$ such that $\forall N \in N_g.\ Card\{n_i \in N_i \mid Class(n_i) =
N\} = X_N$. Since $X_N \geq 1$, condition (1) of Figure 3 is verified.

The number of instance nodes being fixed, we can reason locally and show that
for each edge $A \xrightarrow{[i_A,j_A]\ \alpha\ [i_B,j_B]} B$ we can produce instance edges
respecting conditions (2) and (3) of Figure 3. From the constraints, the interval
$[X_A\, i_B, X_A\, j_B] \cap [X_B\, i_A, X_B\, j_A]$ cannot be empty. We choose a value $E^{\alpha}_{AB}$ of this
interval as the number of $\alpha$-edges between the instances of $A$ and $B$. We attach
to each instance of $A$ (in turn and cyclically) one edge to a new arbitrary instance
of $B$ until the $E^{\alpha}_{AB}$ edges are placed. Since $X_A\, i_B \leq E^{\alpha}_{AB} \leq X_A\, j_B$, this process
ensures that condition (2) holds. Now, if an instance $b_k$ of $B$ has more than $j_A$
(resp. less than $i_A$) incoming $\alpha$-edges we know from $X_B\, i_A \leq E^{\alpha}_{AB} \leq X_B\, j_A$
that there exists another instance $b_l$ of $B$ having less than $j_A$ (resp. more than
$i_A$) incoming $\alpha$-edges. We switch the destination of one incoming $\alpha$-edge from
$b_k$ to $b_l$ (resp. $b_l$ to $b_k$). This process can be repeated until condition (3) holds.
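
The constructive direction of the proof can be illustrated for a single generic edge. The sketch below deliberately swaps in a different technique from the one in the proof: instead of placing edges cyclically and then switching destinations, it spreads the edges evenly over both ends in one pass, which yields degrees within the required intervals directly. The function names and the index-based encoding of instances are hypothetical, and the instance counts are assumed to already satisfy the constraints of Figure 4.

```python
from math import gcd


def build_alpha_edges(x_a, x_b, ms, md):
    """Return a set of (a_index, b_index) pairs for one generic edge A -> B.

    x_a, x_b are the numbers of instances of A and B; ms = (iA, jA) and
    md = (iB, jB) are the multiplicities at the source and destination ends.
    """
    (i_a, j_a), (i_b, j_b) = ms, md
    low = max(x_a * i_b, x_b * i_a)               # fewest edges allowed
    high = min(x_a * j_b, x_b * j_a, x_a * x_b)   # most edges allowed, no duplicates
    if low > high:
        raise ValueError("instance counts violate the constraints of Figure 4")
    block = x_a * x_b // gcd(x_a, x_b)            # least common multiple
    # Edge k leaves instance k mod x_a; the destination index is shifted by one
    # after each full block so that no (source, destination) pair repeats.
    return {(k % x_a, (k + k // block) % x_b) for k in range(low)}


def degrees(pairs, x_a, x_b):
    """Out-degrees of the A instances and in-degrees of the B instances."""
    out_deg, in_deg = [0] * x_a, [0] * x_b
    for a, b in pairs:
        out_deg[a] += 1
        in_deg[b] += 1
    return out_deg, in_deg
```

For instance, with three instances of A, two instances of B, `ms = (1, 2)` and `md = (1, 1)`, the sketch places three edges: every instance of A has exactly one outgoing edge and every instance of B has one or two incoming edges.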



2.4 Solid Edges 

From our experience, the notation described above is too imprecise to define soft- 
ware architecture views. For example, it is not possible to express the constraint 
that each instance of a node $A$ is doubly linked to an instance of $B$. Indeed, the
diagram made of the two edges $A \xrightarrow{1\ \alpha\ 1} B$ and $A \xrightarrow{1\ \beta\ 1} B$ accepts

[instance graph over two instances $a_1$ and $a_2$ of $A$, linked by $\alpha$- and $\beta$-edges]

as a valid instance. Although our graphical notation can specify graphs with
properties such as “for all instances of $A$ there is a simple² (or length $k$,
$\alpha$-typed, ...) path to an instance of $B$”, it does not have the ability to enforce
properties such as “there is a simple path from each instance of $A$ to itself”.
This makes some sharing patterns impossible to describe. While this may not 
be a problem for diagrams describing the organization of data in UML structural 
views, a greater precision in the specification is often desirable for many typical 
software architecture views (such as the control or physical views). 

We extend our notation by introducing the notion of “solid edges” . They are 
represented as bold arrows as shown in the diagram of Figure 5. 




Fig. 5. A diagram with solid edges 



Solid edges bear implicitly the multiplicity 1 at both ends. The intention is 
that the structure of the region delimited by solid edges should be reflected in 
the instances of the diagram. For example, the diagram of Figure 5 accepts the 
graph of Figure 2 (a) as a valid instance but not the graph of Figure 2 (b). 

The semantics of diagrams must be extended to take into account this new
feature. A generic graph is now represented by the structure $\langle N_g, E_g, m_s, m_d, s \rangle$
where the function $s : \mathcal{N} \times T_e \times \mathcal{N} \to Bool$ is such that $s(E) \iff E$ is a solid edge.
We assume that $s(E) \Rightarrow m_s(E) = [1,1] \wedge m_d(E) = [1,1]$. A new condition (5) is
imposed in the semantics of Figure 3.



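$\forall a, b \in N_i.\ Class(a) = A \wedge Class(b) = B \wedge solid(a,b) \wedge s(A,\alpha,B) \Rightarrow (a,\alpha,b) \in E_i$   (5)

where

$solid(a,b) \equiv (a = b) \ \vee\ \exists (a_1,\alpha_1,b_1), \ldots, (a_k,\alpha_k,b_k) \in E_i$ such that
$a_1 = a$, $b_k = b$,
$\bigwedge_{i=1}^{k} s(Class(a_i), \alpha_i, Class(b_i))$,
$\bigwedge_{i=1}^{k-1} \left( (a_i = a_{i+1}) \vee (a_i = b_{i+1}) \vee (b_i = a_{i+1}) \vee (b_i = b_{i+1}) \right)$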



This condition ensures that for each connected region in the instance graph
corresponding to a solid region in the generic graph, the Class function is a
bijection. So, each solid region must have only isomorphic images in the instance
graph. Consistency checking is not affected by this extension. It can be done as
before by considering solid edges as standard edges with multiplicities 1 and 1.
This is expressed formally as follows:

² A path where any two nodes $x$ and $y$ are such that $Class(x) \neq Class(y)$.

Property 2

$Sem(\langle N_g, E_g, m_s, m_d, \lambda x.\mathit{false} \rangle) \neq \emptyset \iff Sem(\langle N_g, E_g, m_s, m_d, s \rangle) \neq \emptyset$

Here, $G_g = \langle N_g, E_g, m_s, m_d, \lambda x.\mathit{false} \rangle$ denotes the generic graph of a diagram
where all solid edges are considered as standard edges with multiplicities 1 and
1. The proof proceeds by constructing from any valid instance $G_i$ of $G_g$ another
valid instance graph satisfying condition (5). First, all edges of $G_i$ corresponding
to solid edges are removed. From the constraints of Figure 4, we know that all
nodes $\{A_1, \ldots, A_k\}$ connected by edges of multiplicities 1 and 1 in $G_g$ have the
same number of instances (say $p$). For each maximal set of nodes $\{A_1, \ldots, A_k\}$
connected by solid edges in $G_g$, the corresponding instance nodes can be split into
$p$ sets $\{a_1, \ldots, a_k\}$ where $Class(a_i) = A_i$. For each such set, the previously
removed edges are replaced isomorphically to the solid sub-graph. The multiplicity
constraints are respected and condition (4) holds.
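
Condition (5) of this section can be sketched on the doubly-linked example. The code below rests on a simplifying assumption: it reads the relation solid as plain undirected connectivity through instances of solid edges, computed with a union-find structure. The function names and the encoding of the example are hypothetical.

```python
def solid_regions(nodes_i, edges_i, cls, solid_g):
    """Group instance nodes connected through instances of solid edges.

    Returns a `find` function mapping each node to a region representative.
    """
    parent = {n: n for n in nodes_i}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for (a, alpha, b) in edges_i:
        if (cls[a], alpha, cls[b]) in solid_g:
            parent[find(a)] = find(b)
    return find


def satisfies_condition_5(nodes_i, edges_i, cls, solid_g):
    """Within a solid region, every solid generic edge must have its instance edge."""
    find = solid_regions(nodes_i, edges_i, cls, solid_g)
    for a in nodes_i:
        for b in nodes_i:
            if find(a) != find(b):
                continue
            for (A, alpha, B) in solid_g:
                if cls[a] == A and cls[b] == B and (a, alpha, b) not in edges_i:
                    return False
    return True


# Two solid edges from A to B, as in the doubly-linked example.
SOLID = {("A", "alpha", "B"), ("A", "beta", "B")}
CLS = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
NODES = set(CLS)
DOUBLY_LINKED = {("a1", "alpha", "b1"), ("a1", "beta", "b1"),
                 ("a2", "alpha", "b2"), ("a2", "beta", "b2")}
CROSSED = {("a1", "alpha", "b1"), ("a1", "beta", "b2"),
           ("a2", "alpha", "b2"), ("a2", "beta", "b1")}
```

`DOUBLY_LINKED` passes the check, while `CROSSED`, which respects all multiplicities, is rejected because its four nodes form one connected region that is not an isomorphic image of the solid region.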

3 Application to the Design of a Train Control System 

We illustrate our framework with a case study proposed in [5] concerning the 
specification of the architecture of a train control system. We propose a multiple 
view architecture defined as a collection of diagrams - one per view, plus one 
diagram to describe the correspondences between views. The following is an 
excerpt of [5] identifying the main challenges of the case study: 

“Generally, a control system performs the following tasks: processing 
the raw data obtained from the environment through sensing devices, 
taking corrective action [...]. In addition to the functional requirements 
of these systems, many non-functional requirements, such as geographi- 
cal distribution over a possibly wide variety of different host processors, 
place constraints on the design freedom that are very difficult to meet. 

In practice, there are many interrelated system aspects that need to be 
considered. [...] Although solutions are available for many of the prob- 
lems in isolation, incompatible, or even conflicting premises make it very 
difficult to cover all design aspects by a coherent solution.” 

We present here a simplified version of the architecture of a train control 
system focusing on the process of corrective actions. We distinguish two parts 
in the system: the trains and a control system monitoring the traffic. In our 
solution, each train computes and performs speed corrections with respect to 
three parameters: its route (a detailed schedule including the reservation dates 
of each track section), the railway topology, and its actual state (speed and 
position) as indicated by its sensing devices. Trains periodically send their states
to the control system which extrapolates from the collected states to detect
future conflicts. Once identified, conflicts are solved and new routes are sent to 
trains. 

The train control system is described as a multiple view architecture. In the 
simplified version, the architecture consists of three views: the distributed func- 
tional view (dfv), the distributed control view (dcv), and the physical view 
(phv). In our framework, each view is described by a diagram, and
correspondences between views are established through an additional diagram. From these
related views, we show how the requirements listed above can be addressed in 
our framework. 

In each of the three views, we distinguish two parts consisting of nodes con- 
nected by solid edges: the train part and the control system part. These two 
parts themselves are connected by standard edges with multiplicities * and 1. 
This allows several trains to be connected to the control system. Each view is
described by a diagram belonging to a given style. A style is defined here as a set
of node and edge types together with type constraints such as: edges of type r
(for read) can only be used to connect nodes of type data to nodes of type
process. Different views use different types so that a node type indicates without
any ambiguity the view to which the node belongs. Types are associated with a
graphical notation defined in the caption of each view. For the sake of clarity,
we use expressions in typewriter font (such as process speed correction) to
denote node names.
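
The notion of style described above can be sketched as a table of type constraints. This is an illustration under stated assumptions: the function `style_violations` and the table layout are hypothetical, edge orientation is ignored (only the types of the two endpoints are compared), and only the single rule quoted in the text is encoded.

```python
def style_violations(edges, node_type, allowed):
    """Return the edges whose endpoint types are not permitted by the style.

    edges     : iterable of (source, edge_type, destination)
    node_type : mapping from node name to node type
    allowed   : mapping from edge type to a set of frozensets of endpoint types
    """
    return [
        (a, t, b)
        for (a, t, b) in edges
        if frozenset((node_type[a], node_type[b])) not in allowed.get(t, set())
    ]


# Rule from the text: r-edges only connect a data node and a process node.
ALLOWED = {"r": {frozenset({"data", "process"})}}
NODE_TYPE = {
    "process speed correction": "process",
    "railway topology": "data",
    "train state": "data",
}
OK_EDGES = [("process speed correction", "r", "railway topology")]
BAD_EDGES = [("railway topology", "r", "train state")]  # data to data
```

An edge type absent from the table is reported as a violation, which matches the reading that a style lists all the edge types a view may use.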




Fig. 6. The distributed functional view 



The distributed functional view (Figure 6) describes the data flows and data 
dependencies between processes, using four node types and three edge types.
sensor and actuator nodes represent the input and output ports of the system,
process nodes correspond to entities of computation, and data nodes correspond
to variables. sensor, actuator and process nodes can be connected to data
nodes by edges of type r or w, denoting respectively the potential to read or
write a variable. An edge of type m represents message passing between two
process nodes.



[Figure 7: two panels, the train part and the control system part. DCV caption:
source, task, sink, and place nodes; t (trigger) edges.]



Fig. 7. The distributed control view 



The distributed control view (Figure 7) describes the scheduling of processes 
and the control flow in the spirit of Petri nets. It uses four node types and one 
edge type. Nodes of types source, sink, and task can be seen as transitions in a 
Petri net. Our intention is to draw a parallel between those node types and the 
node types of the distributed functional view (sensor with source, actuator
with sink, and process with task). place nodes correspond to repositories of
tokens in Petri nets. They are used for description purposes only and have no
corresponding nodes in the other views. Nodes of the distributed control view 
are connected by triggering edges of type t. 




Fig. 8. The physical view

The physical view (Figure 8) describes the network connections between
computers. It has two types of nodes, computer and network, and two types of
edges: c, which denotes a connection between a computer and a network, and l,
which represents a connection between networks.




Fig. 9. DFV, DCV, and PHV correspondences



It is easy to check that the multiplicity constraints can be satisfied, so that
each view is consistent. The different views of the system are not unrelated,
though. In our framework, the inter-view relationships are expressed through
edges of a specific type, called map. For example, an edge $(P, map, C)$ between
node $P$ of type process and node $C$ of type computer indicates that the process
$P$ of the distributed functional view is mapped onto the computer $C$ of the
physical view. The map relations between the three views are described in an
additional diagram (Figure 9) which contains nodes of the three views to be
related. Each node of type process (resp. sensor, actuator) in the distributed
functional view is mapped onto a node of type task (resp. source, sink) with
the same name in the distributed control view. data nodes do not have any
counterpart in the distributed control view. All nodes of types data, process,
sensor, and actuator in the train part of the distributed functional view are
mapped onto the node named train computer in the physical view. The other
ones are mapped onto one of the nodes computer 1 and computer 2 in the
control system part of the physical view. All nodes of types task, source, and
sink in the train part (resp. the control system part) of the distributed control
view are mapped onto nodes of type computer in the same part of the physical
view.

The diagram of Figure 9 can be superimposed on the three diagrams of
Figures 6, 7 and 8 to form the complete diagram defining the software architecture
of the system. Consistency checking (Section 2.3) of the complete diagram
ensures that the family of architectures described by the three related views is not
empty.

4 Specification and Verification of Constraints 

The graphical notation introduced in Section 2 is well-suited to the specification 
of the overall connection pattern of each view and the correspondences between 
their components. However, this notation generates only simple constraints on 
the number of occurrences of edges and nodes. In the context of software ar- 
chitectures, it is often desirable to be able to impose more sophisticated (both 
inter-view and intra-view) consistency constraints. In our case study, for exam- 
ple, we would like to impose that a process of the distributed functional view 
and its corresponding task in the distributed control view are mapped onto the 
same computer in the physical view. Another requirement could be that each 
process must be on the same site as any data to which it has read or write access. 

To address this need, we propose a small language of constraints and define 
a complete checking algorithm for the generic graphs introduced in Section 2. 
We illustrate the interest of this language with the case study introduced in 
Section 3. 

4.1 A Simple Constraint Language 

The syntax of our constraint language is the following: 

$C ::= \forall x_1 : \tau_1, \ldots, x_n : \tau_n.\ P$

$P ::= P_1 \wedge P_2 \mid P_1 \vee P_2 \mid edge(x,\alpha,y) \mid path_{E_t}(x,y) \mid \neg P$

$E_t \subseteq T_e \cup \overline{T_e}$

We also use $\Rightarrow$ in the following but we do not introduce it as a basic connector
since it can be defined using $\vee$ and $\neg$. The semantics of the constraint language
is presented in Figure 10. A generic graph $G_g$ satisfies a constraint $C$ if all its
instances $G_i$ satisfy the constraint. Constraints $path_{E_t}(x,y)$ are defined with
respect to a set of edge types $E_t$. $\overline{T_e} = \{\overline{\alpha} \mid \alpha \in T_e\}$ is a set of annotated types
used to accept inverse edges in paths. For example, $path_{\{\alpha,\overline{\alpha}\}}(a,b)$ is true if there
exists an undirected path, that is a path made of undirected $\alpha$-edges, between
nodes $a$ and $b$. Note that we consider only simple paths of the instance graphs.
Simple paths correspond to non-cyclic paths of the generic graph (condition
$Class(a_i) \neq Class(a_j)$ at the bottom of Figure 10).

Examples of the use of the language to express both inter-view and intra-view 
compatibility constraints are provided in Section 4.3. We now turn our attention 
to the design of a verification algorithm for this language of constraints. 
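
Before turning to the algorithm, the constraint language can be made concrete with a small evaluator that checks a property on one given instance graph. This is a sketch under stated assumptions: formulas are encoded as nested tuples, an inverse edge type is written with a leading `~`, a path is required to contain at least one edge, and all function names are hypothetical. It evaluates a single instance only, whereas satisfaction by a generic graph quantifies over all instances.

```python
from itertools import product


def has_path(edges_i, cls, types, x, y):
    """True if a simple path (pairwise distinct classes) leads from x to y."""
    def neighbours(n):
        for (a, t, b) in edges_i:
            if a == n and t in types:
                yield b                      # edge followed forwards
            if b == n and "~" + t in types:
                yield a                      # edge followed backwards

    def search(n, seen_classes):
        for m in neighbours(n):
            if cls[m] in seen_classes:
                continue
            if m == y or search(m, seen_classes | {cls[m]}):
                return True
        return False

    return search(x, {cls[x]})


def holds(p, env, edges_i, cls):
    """Evaluate property p under the variable assignment env."""
    op = p[0]
    if op == "edge":
        _, x, t, y = p
        return (env[x], t, env[y]) in edges_i
    if op == "path":
        _, types, x, y = p
        return has_path(edges_i, cls, types, env[x], env[y])
    if op == "not":
        return not holds(p[1], env, edges_i, cls)
    if op == "and":
        return holds(p[1], env, edges_i, cls) and holds(p[2], env, edges_i, cls)
    if op == "or":
        return holds(p[1], env, edges_i, cls) or holds(p[2], env, edges_i, cls)
    raise ValueError(f"unknown connective: {op}")


def forall(variables, p, nodes_i, edges_i, cls, node_type):
    """Check p for every assignment of typed variables to instance nodes."""
    names = [name for name, _ in variables]
    domains = [[n for n in nodes_i if node_type[cls[n]] == ty]
               for _, ty in variables]
    return all(holds(p, dict(zip(names, values)), edges_i, cls)
               for values in product(*domains))
```

For example, the implication `edge(p, map, t) ∧ edge(p, map, c) ⇒ edge(t, map, c)` is encoded as `("or", ("not", ("and", ...)), ...)`, following the definition of implication by disjunction and negation.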

4.2 A Constraint Checking Algorithm 

P. Fradet, D. Le Metayer, and Michael Perin

$G_g \models C \iff \forall G_i \in Sem(G_g).\ G_i \models C$

where $G_i = \langle N_i, E_i \rangle$ and $G_g \overset{Class}{\leadsto} G_i$

$G_i \models \forall x_1 : \tau_1, \ldots, x_n : \tau_n.\ P \iff \forall x_1 : \tau_1, \ldots, x_n : \tau_n \in N_i.\ G_i \models P$

$G_i \models P_1 \wedge P_2 \iff G_i \models P_1 \wedge G_i \models P_2$

$G_i \models P_1 \vee P_2 \iff G_i \models P_1 \vee G_i \models P_2$

$G_i \models edge(x,\alpha,y) \iff (x,\alpha,y) \in E_i$

$G_i \models path_{E_t}(x,y) \iff \exists a_1, \ldots, a_{k+1} \in N_i.\ \exists \alpha_1, \ldots, \alpha_k \in E_t.$
$\quad a_1 = x \wedge a_{k+1} = y$
$\quad \wedge\ \forall i \in [1,k].\ ((a_i, \alpha_i, a_{i+1}) \in E_i \wedge \alpha_i \in T_e) \vee ((a_{i+1}, \alpha_i, a_i) \in E_i \wedge \alpha_i \in \overline{T_e})$
$\quad \wedge\ \forall i, j \in [1,k+1].\ i \neq j \Rightarrow Class(a_i) \neq Class(a_j)$

$G_i \models \neg P \iff G_i \not\models P$

Fig. 10. Semantics of the language of constraints

The semantics of the language of constraints presented in Figure 10 is not directly
suggestive of a checking algorithm because it is defined with respect to the
(potentially infinite) set of all the instances $G_i$ of a generic graph $G_g$. In this
subsection, we sketch a checking algorithm, Check, and provide some intuition
about its correctness and completeness, stated as follows:

Property 3  $Check(C, G_g) \iff G_g \models C$

Space considerations prevent us from presenting the algorithm thoroughly; the 
interested reader can find a detailed account of the algorithm and proofs in [8]. 

The Check procedure outlined in Figure 11 takes two arguments: a property
$\forall x_1 : \tau_1, \ldots, x_n : \tau_n.\ P$ in canonical form and a generic graph $G_g$. Properties in
canonical form are properties in conjunctive normal form with negations bearing
only on relations $edge(x,\alpha,y)$ and $path_{E_t}(x,y)$. The transformation of properties
into their canonical form is straightforward.
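
The transformation into canonical form can be sketched with the usual two steps: push negations down to the basic properties, then distribute disjunction over conjunction. The encoding of formulas as nested tuples, the extra `implies` connective, and the function names are assumptions made for this illustration only.

```python
def nnf(p, negate=False):
    """Push negations inwards so that they bear only on basic properties."""
    op = p[0]
    if op == "not":
        return nnf(p[1], not negate)
    if op == "implies":
        return nnf(("or", ("not", p[1]), p[2]), negate)
    if op in ("and", "or"):
        if negate:  # De Morgan
            op = "or" if op == "and" else "and"
        return (op, nnf(p[1], negate), nnf(p[2], negate))
    return ("not", p) if negate else p  # edge(...) or path(...)


def clauses(p):
    """Turn a negation normal form into a list of clauses (lists of literals)."""
    if p[0] == "and":
        return clauses(p[1]) + clauses(p[2])
    if p[0] == "or":
        return [left + right for left in clauses(p[1]) for right in clauses(p[2])]
    return [[p]]


def canonical_form(p):
    """Conjunctive normal form: a conjunction (list) of disjunctions (lists)."""
    return clauses(nnf(p))


# Shape of an implication whose premise contains a disjunction.
R = ("edge", "p", "r", "d")
W = ("edge", "p", "w", "d")
M1 = ("edge", "p", "map", "m1")
GOAL = ("edge", "m1", "c", "n")
EXAMPLE = ("implies", ("and", ("or", R, W), M1), GOAL)
```

On `EXAMPLE`, the disjunction in the premise yields a conjunction of two disjunctions, one for the read edge and one for the write edge.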

The first step of the algorithm consists in considering all the possible classes
$X_i$ (denoting nodes of the generic graph) consistent with the types of the
variables $x_i$. The interesting case in the definition of Verif is the disjunction which is
based on a proof by contradiction. The function Contra returns true if its
argument contains a basic property and its negation. The function Gen is called with
three arguments: the context Class, which records the class of each variable, the
generic graph $G_g$ and the set of basic properties³ $\{\neg B_1, \ldots, \neg B_n\}$. Gen is the
core of the algorithm: it generates all the basic properties that can be derived
from this initial set. Gen is defined as a composition of intermediate functions:

— GenPe expands $path(x,y)$ properties into sequences of edges by introducing
fresh node variables. All the simple paths between $Class(x)$ and $Class(y)$
in the generic graph $G_g$ are considered. As a consequence, GenPe returns a
set of tuples $\langle Class, G_g, S \rangle$ which corresponds to a logical disjunction. This
justifies the use of the $f^*$ notation to ensure that the subsequent functions
are applied to all the tuple elements of this set. The Lub function is then
used to compute the intersection of all the resulting (non contradictory) sets;
it corresponds to disjunction elimination in an inference system.

— GenNeg exploits the generic graph and condition (4) of the semantics of
generic graphs (see Figure 3) to produce all valid $\neg edge$ and $\neg path$ relations.

— GenSo computes the relation solid as defined in Section 2.4 and GenEd applies
condition (5) to derive all possible new edges.

— GenEq exploits the multiplicities in the generic graph to derive all possible
equalities between nodes. For example, if the generic graph is such that
$m_d(Class(x), \alpha, Class(y)) = 1$
then GenEq derives $y_1 = y_2$ from $edge(x,\alpha,y_1)$ and $edge(x,\alpha,y_2)$.

— GenSub uses the equalities produced by GenEq to derive new properties (for
example $edge(x_2,\alpha,y)$ can be derived from $x_1 = x_2$ and $edge(x_1,\alpha,y)$). The
application of GenSub can lead to the derivation of new solid relations by
GenSo, hence the iteration. The termination of iter is ensured by the fact
that no new variable is introduced in the iteration steps (so only a finite
number of basic properties can be generated by Gen).

— GenPi generates the path properties implied by the edge properties.

— GenNi generates negations that follow from deduction rules such as:
$edge(x,\alpha,y) \wedge \neg path_{E_t}(x,z) \Rightarrow \neg path_{E_t}(y,z)$.
It can be applied as a final step since negations are not used by the preceding
functions.

³ Basic properties are properties of the form $edge(x,\alpha,y)$, $path_{E_t}(x,y)$ or their
negation.

$Check(\forall x_1 : \tau_1, \ldots, x_n : \tau_n.\ P,\ G_g) = \bigwedge Verif([x_i \mapsto X_i], G_g, P)$
where $G_g = \langle N_g, E_g, m_s, m_d, s \rangle$ and the conjunction ranges over the classes $X_i$
consistent with the types $\tau_i$

$Verif(Class, G_g, P_1 \wedge \ldots \wedge P_n) = Verif(Class, G_g, P_1) \wedge \ldots \wedge Verif(Class, G_g, P_n)$
$Verif(Class, G_g, B_1 \vee \ldots \vee B_n) = Contra(Gen(\langle Class, G_g, \{\neg B_1, \ldots, \neg B_n\} \rangle))$
where $Contra(S) = \exists B \in S.\ \neg B \in S$

$Gen(\langle Class, G_g, S \rangle) = Lub \circ GenNi^* \circ GenPi^*$
$\qquad \circ\ iter\,(GenSub^* \circ GenEq^* \circ GenEd^* \circ GenSo^*)$
$\qquad \circ\ GenNeg^* \circ GenPe(\langle Class, G_g, S \rangle)$
where $f^*(s) = \{f(x) \mid x \in s\}$
$iter\ f\ x = x$ if $f(x) = x$, and $iter\ f\ (f(x))$ otherwise

$Lub(\{\langle Class_1, G_g, S_1 \rangle, \ldots, \langle Class_n, G_g, S_n \rangle\}) = \bigcap_{\neg Contra(S_j)} S_j$

Fig. 11. Constraint checking algorithm

The correctness and completeness proofs described in [8] are based on an
intermediate inference system (which is itself complete and correct with respect
to the semantics of generic graphs and the language of constraints). Correctness
of the algorithm is straightforward. Completeness and termination rely on the
application ordering of the intermediate functions of Gen. The proof uses the
fact that an intermediate function $F$ cannot generate a property which could be
used (to derive new properties) by a function $F'$ that is not applied after $F$.

The Check procedure outlined here is a naive algorithm derived from the 
inference system. We did not strive to apply any optimisation here. In an effective 
implementation, the intermediate functions of Gen would be re-ordered so as to 
detect contradictions as early as possible (and the algorithm would stop as soon 
as two contradictory properties are generated). 
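
The saturation at the heart of Gen can be illustrated on a much reduced scale. The sketch below is not the algorithm of Figure 11: it keeps only a fixpoint iterator in the style of iter, a contradiction test in the style of Contra, and simplified stand-ins for GenEq and GenSub that handle edge properties alone. The function names, the tuple encoding of properties, and the example data are hypothetical.

```python
def iterate(f, x):
    """Apply f repeatedly until a fixpoint is reached (iter f x)."""
    while True:
        y = f(x)
        if y == x:
            return x
        x = y


def contra(props):
    """True if the set contains a basic property together with its negation."""
    return any(("not", p) in props for p in props)


def gen_eq(props, cls, md):
    """Derive y1 = y2 from edge(x, t, y1) and edge(x, t, y2) when the
    destination multiplicity of the generic edge is exactly 1."""
    new = set(props)
    edges = [p for p in props if p[0] == "edge"]
    for (_, x1, t1, y1) in edges:
        for (_, x2, t2, y2) in edges:
            if (x1 == x2 and t1 == t2 and y1 != y2 and cls[y1] == cls[y2]
                    and md.get((cls[x1], t1, cls[y1])) == (1, 1)):
                new.add(("eq", y1, y2))
    return frozenset(new)


def gen_sub(props):
    """Rewrite positive edge properties using the derived equalities."""
    new = set(props)
    equalities = [(p[1], p[2]) for p in props if p[0] == "eq"]
    for (u, v) in equalities:
        for p in props:
            if p[0] == "edge":
                _, x, t, y = p
                new.add(("edge", v if x == u else x, t, v if y == u else y))
    return frozenset(new)


def saturate(props, cls, md):
    """Closure of props under the two simplified derivation steps."""
    return iterate(lambda s: gen_sub(gen_eq(s, cls, md)), frozenset(props))
```

Because no new variable is introduced, only finitely many properties can be generated and `iterate` terminates. An implementation aiming at early detection of contradictions could call `contra` inside the loop instead of after it.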

4.3 Automatic Verification of Compatibility Constraints 

We return to the case study introduced in Section 3 and we show how inter-view
and intra-view compatibility relations can be expressed within our constraint
language. We consider three constraints involving the distributed functional view,
the control view and the physical view. In this subsection, we let $G_g$ stand for
the generic graph $\langle N_g, E_g, m_s, m_d, s \rangle$ made of all the (related) views defined in
Section 3.

1. Process-Task consistency constraint: “A process of the distributed
functional view and its corresponding task in the distributed control view must
be placed on the same computer in the physical view.” The mapping
between DFV and PHV associates data with the processes that use them, while
the mapping between DCV and PHV is driven by concurrency concerns. The
compatibility constraint imposes that a tradeoff must be found that agrees
on the placement of processes on the available computers. It is expressed as
follows in our language:

$\forall p : process,\ t : task,\ c : computer.$
$\quad edge(p, map, t) \wedge edge(p, map, c) \Rightarrow edge(t, map, c)$

The canonical form of this constraint is:

$\forall p : process,\ t : task,\ c : computer.$
$\quad edge(t, map, c) \vee \neg edge(p, map, t) \vee \neg edge(p, map, c)$

The application of the Check procedure to this constraint and $G_g$ results in
a call to:

$Contra(Gen(\langle Class, G_g, \{\neg edge(t, map, c),\ edge(p, map, t),\ edge(p, map, c)\} \rangle))$

Then, whatever the mapping of the node variables $p$, $c$, $t$ on class nodes
of the complete diagram, Gen adds the relation $edge(t, map, c)$ to the initial
set. It results in a contradiction with $\neg edge(t, map, c)$ and Check returns true.
The relation $edge(t, map, c)$ is generated by the GenSo and GenEd functions
from the relations $edge(p, map, t)$ and $edge(p, map, c)$ using the fact that the
map edges are solid edges.

2. Site consistency constraint: “A process should be on the same site as
any data to which it has read or write access. A site is taken as a set of
computers linked by a local network.” This second constraint is expressed
as follows in our language:

$\forall p : process,\ d : data,\ m_1, m_2 : computer,\ n : network.$
$\quad (edge(p, r, d) \vee edge(p, w, d))$
$\quad \wedge\ edge(p, map, m_1) \wedge edge(d, map, m_2) \wedge edge(m_2, c, n)$
$\quad \Rightarrow edge(m_1, c, n)$

The application of the Check procedure shows that this constraint does not
hold. After translation of the constraint into its canonical form, Verif has to
examine a conjunction of two disjunctions. One of these disjunctions leads
to a call to Gen with the initial set of relations:

$\{edge(p, r, d),\ edge(p, map, m_1),\ edge(d, map, m_2),\ edge(m_2, c, n),\ \neg edge(m_1, c, n)\}$

Together with the mapping [$p \mapsto$ process speed correction, $d \mapsto$ railway
topology, $m_1 \mapsto$ train computer, $m_2 \mapsto$ computer 2, $n \mapsto$ local network], the
Gen function does not generate the expected contradiction. The reason lies
in the r link between the nodes railway topology and process speed
correction in the distributed functional view (Figure 6) and the fact that
these two nodes are mapped onto two computers (computer 2 and train
computer) belonging to two different sites (Figure 9): computer 2 is linked
to local network and train computer is linked to local train network
(Figure 8). This example shows that the Check procedure can easily be
extended to return counterexamples.



3. Communication consistency constraint: “Two processes communicating
by message passing are not allowed to share data.” This constraint
imposes that message passing is used only for communication between distant
processes. The constraint is expressed as follows in our language:

$\forall p_1, p_1', p_2, p_2' : process,\ d : data.$
$\quad \neg (\ (edge(p_1, r, d) \vee edge(p_1, w, d))$
$\quad\quad \wedge\ (edge(p_2, r, d) \vee edge(p_2, w, d))$
$\quad\quad \wedge\ path_{\{r,w\}}(p_1, p_1') \wedge path_{\{r,w\}}(p_2, p_2')$
$\quad\quad \wedge\ edge(p_1', m, p_2')\ )$

This constraint, which is an example of an intra-view consistency relation, is
not satisfied by the diagram of Figure 6. This is because process prediction
and process speed correction share the data railway topology and
still communicate (indirectly) by message passing (following a {r, w, r, w, m}
path through the nodes send state and update state).



These constraints illustrate that inconsistencies can naturally arise in the
specification of multiple views. Counterexamples produced by the extended Check
procedure provide useful information to construct a correct architecture. A
simple solution to satisfy the last two constraints consists in providing each train
with a local copy of the railway topology data and adding an update
mechanism as was done for the “state” variable (global state and train state in
Figure 6). The two copies could then be mapped onto two different computers.







5 Conclusion 

We have presented a framework for the definition of multiple view architectures 
and techniques for the automatic verification of their consistency. It should be 
noted that we have defined views as collections of uninterpreted graphs. The in- 
tended meaning of a diagram (or a collection of diagrams) is conveyed indirectly 
through the constraints. This uninterpreted nature of graphs makes it possible 
to specify a great variety of views. In Section 3 we just sketched three views used 
in the train control system case study. Following the approach presented in this 
paper, we have been able to address some of the requirements (both functional 
and non-functional) listed in [5]. For example, distribution and fault-tolerance 
are expressed directly as views. Fault-tolerance views are refinements^ of the 
distribution views and they give rise to constraints like “Data A and Data B 
must not be on the same memory device” or “Process A and Process B must be 
mapped onto two different processors” . 

An interesting byproduct of the work described here is that we can apply 
the algorithm of Section 2 to check the consistency of UML diagrams. UML also 
includes a language called OCL [14] for defining additional constraints on dia- 
grams. OCL constraints can express navigations through the diagrams and ac- 
cumulations of set constraints on the values of the node attributes. Even if our 
language of constraints and OCL are close in spirit, they are not directly com- 
parable because OCL does not include recursive walks through the graph similar 
to our path constraint; on the other hand, we did not consider constraints on 
node attributes. Previous attempts at formalizing certain aspects of UML and 
OCL (based on translations into Z) are reported in [6,9]. In contrast with the 
framework presented here, they do not lead to automatic verification methods. 
Recent studies have also been conducted on the suitability of UML to define 
software architectures. In particular, [10] and [13] present detailed assessments 
of the advantages and limitations of UML in this context. In comparison with 
these approaches, we do not really use UML here, considering only graphical and 
multiplicity notations as a basis for describing the structure of the architecture. 
As mentioned in [10], a graphical notation may sometimes be too cumbersome to 
express correspondences between views. Since our diagrams are translated into 
constraints, we can easily allow a mixed notation (including diagrams and con- 
straints) allowing architects to use the most appropriate means to define their 
views. 

More generally, we put forward a two-layer definition of software architec- 
ture views: the basic or structural layer is defined using a graphical (or mixed) 
notation and the specific or semantical layer is defined on the top of the first 
one, through node and edge attributes. We have only considered simple type 
attributes here because they are sufficient both to express interesting properties 
and to raise non-trivial consistency issues. We are currently working on the in- 
tegration of more sophisticated attribute domains to describe other aspects of 
software architectures (such as performance or interaction protocols). 

^ Refinement is defined formally here as the expansion of nodes by subgraphs. 
Finkelstein et al. have proposed a framework supporting the definition and 
use of multiple viewpoints in system development [7]. They consider views 
expressed in different formalisms (Petri nets, action tables, object structure 
diagrams, ...). Correspondences between views are defined as relations between 
objects in a logic close to our language of constraints. The logical formulae are 
used as rules to help the designers to relate the views. An important departure 
of the work presented here with respect to [7] is that our consistency checking 
is performed on a family of architectures (which is potentially infinite); their 
verification is simpler since it applies to fixed viewpoint specifications (which 
correspond to a particular instance graph in our framework). To check consis- 
tency, they first require the designers to provide a translation of the views into 
first order logic. Consistency is then checked within this common formalism and 
meta-rules are used to report inconsistencies at the view level. Their philosophy 
is that “it is not possible (or even desirable) in general to enforce consistency 
between all the views at all times because it can unnecessarily constrain the 
development process”. So they put stress on inconsistency management rather 
than consistency checking itself. 

Consistency checking has also been carried out in the context of specifica- 
tion languages like Z, Lotos or Larch [16,3]. The traditional approach, which 
can be called “implementation consistency”, is summarized as follows [3]: “n 
specifications are consistent if and only if there exists a physical 
implementation which is a realization of all the specifications, i.e. all the specifications can 
be implemented in a single system”. In contrast, our approach decouples the 
issues of consistency and conformance for a better separation of concerns and 
an increased tractability. In this paper, we have been concerned exclusively with 
consistency. Our approach to conformance consists in seeing the software archi- 
tecture views as collections of constraints that can be exploited to guide the 
development process. For example, the functional view specifies constraints on 
the possible flows of data between variables; the control view adds constraints 
on method or procedure calls and their sequencing. We are currently designing 
a generic development environment which takes architectural constraints as pa- 
rameters and guarantees that only programs that conform to these constraints 
can be constructed. In addition to the views already mentioned in this paper, 
this environment should make use of a “development view” , which represents the 
hierarchical organization of programs and data into development units (classes 
in Java). 
Abstract. Building software systems out of pre-fabricated components is a very 
attractive vision. Distributed Component Platforms (DCP) and their visual 
development environments bring this vision closer to reality than ever. At the 
same time, some experiences with component libraries warn us about potential 
problems that arise in case of software system families or systems that evolve 
over many years of changes. Indeed, implementation level components, when 
affected by many independent changes, tend to grow in both size and number, 
impeding reuse. In this paper, we analyze in detail this effect and propose a 
program construction environment, based on generative techniques, to help in 
customization and evolution of component-based systems. This solution allows 
us to reap benefits of DCPs during runtime and, at the same time, keep 
components under control during system construction and evolution. In the 
paper, we describe such a construction environment for component-based 
systems that we built with a commercial generator and illustrate its features 
with examples from our domain engineering project. The main lesson learnt 
from our project is that generative techniques can extend the strengths of the 
component-based approach in two important ways: Firstly, generative 
techniques automate routine component customization and composition tasks 
and allow developers to work more productively, at a higher abstraction level. 
Secondly, as custom components with required properties are generated on 
demand, we do not need to store and manage multiple versions of components, 
and components do not grow excessively in size, which helps developers keep the 
complexity of an evolving system under control. 



1 Introduction 

Interest in Distributed Component Platforms (DCP) [6,17,22] is growing, and vendors 
are bringing to market many software development environments for building software 
systems out of pre-fabricated components. Most often components are provided in a 
binary form. By complying with the standards of the underlying DCP, these components 
can be easily interconnected with each other. It is claimed that component-based 
software engineering (CBSE) will cut development costs due to a high level of reuse. 
CBSE will also produce highly maintainable systems: as a system is represented by a 
collection of cooperating components, maintenance will be done by replacing 
components with other components providing richer functionality. This is viewed as a 
potentially simpler task than today’s maintenance of big, monolithic programs. 

DCPs, such as EJB™, ActiveX™ and CORBA™ implementations, offer many 
advantages for deployment and efficient execution of software systems, particularly 
systems that operate in a distributed environment. However, it still remains to be seen 
to what extent we can build maintainable large scale software systems out of 
implementation level components. Experimental studies should address specific 
questions such as: Which design objectives can be achieved with a component-based 
approach and which cannot? What kinds of software systems (and which parts of 
them) can be built with independently developed components? Will a component-based 
system still be under control after a couple of years of maintenance? 

Answers to these questions depend on a specific context in which questions are 
asked. The context also determines the meaning of basic terms of component, 
architecture and CBSE. For example, the required properties of components and 
architectures depend on whether we talk about program construction components or 
components of a runtime architecture. In many component-based approaches, 
components play a double role of construction components and also implementation, 
runtime components. However, different issues matter in program construction than 
during runtime. The main thrust in construction architectures and their components 
is flexibility, i.e., the ability to customize components to meet variant requirements of 
products and the ability to evolve over time to meet changing needs of the business 
environment. For construction components, we need to know how different sources of 
change affect them. On the other hand, issues that matter in runtime architectures 
include allocation of functions to components, deciding which logical components 
should be packaged into a single executable, parallel execution of components, data 
communication, invocation of services between components and synchronization. 
Those are two different perspectives that require two different mechanisms to 
effectively deal with related problems. While a unified program construction and 
runtime world proves very useful in rapid application development, we still do not 
have evidence that it is sufficient for development and evolution of large scale 
software systems. 

Certain problems seem to be inherent to approaches in which we attempt to build 
systems out of pre-fabricated implementation components. Over years of evolution of 
a software system and also during customization for reuse, components are affected 
by independent sources of change [2,5]. Sources of change stem not only from new 
(or variant) functional and non-functional requirements, but also from new versions of 
a computing environment such as tools, operating systems and networks. If we need to 
maintain a component version for each combination of these variants, components 
will grow in size and number. The cumulative effect of this uncontrolled growth is 
likely to become prohibitive to reuse. Repositories with version control to store 
components may ease but will not solve the problem. This phenomenon is known to 
companies who have been using component libraries for some time [5]. 

The above problem can be avoided if we change the focus from the artifact to the 
process, from customized components to the customization process itself. Generative 
techniques allow one to create on demand a component with a required combination 
of variants, so there is no need to manage component versions. The basic idea would 
be to keep specifications of how variants affect a generic component separately from 
the component itself [2]. A generator interprets specifications of required variations 
and produces a custom component when it is needed. This scenario promises a way to 
avoid uncontrolled growth of components in size and number and also to automate 
some of the routine customization activities. 

In this paper, we illustrate the above mentioned problems with examples from our 
domain engineering project. We show how we alleviated the above problems by 
coupling a component-based runtime architecture with a construction environment based 
on generative techniques. We think there are some general lessons we learnt from the 
project. Strengths of generative techniques lie exactly where the open problems, and 
perhaps inherent weaknesses, of the component-based approach are. Generators are 
built to provide an effective way of dealing with variations in a domain. Variations 
are characterized in application domain terms, independently of the implementation, 
in particular, independently of the component structure of the runtime system. For 
given variant requirements, a generator can produce a member of a system family that 
satisfies the variants. The knowledge of how variants affect components, how to 
produce customized components and how to assemble customized components into a 
working system is within a program construction environment used by a generator. 
This construction environment includes customizable and compatible components 
organized within a generic architecture and global structures that help deal with 
changes. Coupling the component-based approach with generative techniques allows one 
to benefit from distributed component architectures during runtime, without 
sacrificing flexibility required during program customizations and evolution. 

The flow of the paper presentation is as follows. After discussing related work, we 
briefly describe a family of Facility Reservation Systems (FRS) and its component- 
based runtime architecture. We use examples from the FRS family to illustrate claims 
throughout the paper. In section 4, we discuss problems that arise if we develop and 
evolve the FRS family in terms of implementation components such as in 
DCOM/ActiveX™, JavaBeans™ and CORBA™ implementations [17,22]. In section 
5, we describe a program construction environment based on generative techniques. 
In this solution, a generator derives runtime components for a custom system from a 
generic architecture of the construction environment. Concluding remarks end the 
paper. 



2 Related Work 

We do not review the literature on CBSE and DCPs as we believe this will be done 
extensively during other conference sessions. The two requirements for component- 
based systems that we emphasize throughout the paper are ease of customization and 
evolution. These two requirements are particularly important in the software system 
family situation and our experiments indeed involved a family of facility reservation 
systems. Software system families arise in situations when we need to develop and 
maintain multiple versions of the same software system (for example, for different 
clients). The concept of program families was first introduced by Parnas [23] who 
proposed information hiding as a technique for handling program families. 
Techniques described in his early papers mainly address variations in design decisions 
across family members. Since then, a range of approaches has been proposed to 
handle different types of variations (for example, variant user requirements or 
platform dependencies) in different application domains. Pre-processing, PCL [27], 
application generators [3,20], Object-Oriented frameworks [14], Domain-Specific 
Software Architectures [28], frame technology [2] and, most recently, distributed 
component platforms [6,17] - they all offer mechanisms to handle variations that can 
be useful in supporting program families. 

Parsons and Wand [24] argue that implementation level objects (read: components) 
are not suitable for analysis, as the realm of analysis is different from that of 
implementation. Their discussion hints at problems that are inherent to any approach 
that focuses too early on implementation components. Bridging the gap between 
domain concepts and implementation has been recognized by many researchers and 
practitioners as a major challenge in software engineering. Much work on software 
design, architecture, domain analysis and application generators stems from this 
premise. The problem can be addressed in top-down direction, from domain concepts 
to implementation components, or in bottom-up direction, with a software 
architecture as a possible meeting point. Top-down and bottom-up research directions 
contribute different insights to the problem. In the context of the component-based 
approach, compositional (scripting) languages [21] allow one to group components 
into higher level abstractions that are closer to domain concepts than individual 
components. Domain analysis approaches attack the problem from the top. They 
allow one to identify domain concepts that might correspond to component groups. 
Moving in that direction also provides insights into types of variability that a 
component should be able to accommodate. 

Generative techniques are based on domain analysis. A generator builds custom 
systems from a set of generic, compatible components. A domain-specific generic 
architecture, incorporating compatible components, is the heart of a generator. A 
generic architecture implements commonalities in a domain, while a meta-language 
allows developers to specify variations to be implemented in the custom system. 
GenVoca is a method and tool for building component-based generators. In JST [3], 
based on GenVoca, a meta-language extends existing programming languages with 
domain-specific constructs. JST provides a practical way of bridging domain concepts 
and implementation. 

The authors of papers [15] and [26] suggest that generative techniques can alleviate 
some of the problems with component-based systems we discussed in the 
introduction. Boca [10] is a generator for component-based systems. Boca provides a 
meta-language to define business semantics. Business components such as customers, 
orders, employees, hiring and invoicing are specified in the meta-language, separately 
from the runtime program characteristics. All the future changes in requirements can 
be done at the level of meta-descriptions. A meta-language provides means for 
maintaining integrity of requirements for a system family during customization and 
evolution. Boca supports synthesis of component-based runtime systems from 
business and implementation-specific component layers. Digre [10] points out further 
benefits of separating a software construction architecture from the runtime 
architectures in the context of distributed component platforms: runtime components 
contain both business logic and platform-specific implementation details (such as, for 
example, code for sending and listening to events specific to Enterprise JavaBeans™ 
platform). A construction architecture makes it possible to separate business concerns 
from platform concerns. Not only will this make the design simpler, but it will also 
make business components independent of changes to the platform technology. 

Many computational aspects of a program spread throughout the whole program 
and cannot be nicely confined to a small number of runtime components. Examples 
include business logic, platform dependencies or code that has to do with 
system-wide qualities such as performance and fault tolerance. In aspect-oriented 
programming [16], each computational aspect is programmed separately and a 
mechanism is provided to compose aspects into an executable program. It is 
envisioned that both program development and maintenance will be done in terms of 
independently defined program aspects. 

The ability to deal separately with different computational aspects is, in our view, a 
general requirement for construction environments. Frame technology [2] fragments 
programs into generic components called frames. Frames organized into a hierarchy 
form a generic architecture, from which programs incorporating specific variants can 
be produced. A frame is a text written in any language (e.g. IDL or Java). Frames can 
be adapted to meet requirements of a specific system by modifying a frame’s code at 
breakpoints. Each frame can reference lower level frames in the hierarchy. This is 
achieved using frame commands (e.g. .COPY, .INSERT) that can add to, or subtract 
from, lower level frames. This “editing” is performed at distinct breakpoints in the 
lower level frames. Each frame can also inherit default behavior and parameters from 
higher level frames. Frames are organized into frame hierarchies for the purpose of 
constructing a member of a system family. The topmost frame in a frame hierarchy, 
called a specification frame, specifies how to adapt the rest of the frame hierarchy to 
given variant requirements. Therefore, each source of change can be traced from the 
specification frame throughout all the affected frames. A frame processor is a tool that 
customizes a frame hierarchy according to directives written in the specification 
frame and assembles the customized system. Industrial experiences show that while 
building a generic architecture is not easy, subsequent productivity gains are 
substantial [2]. These productivity gains are due to flexibility of resulting 
architectures and their ability to evolve over years. In [9], we described a frame-based 
generic architecture and customization method that we developed for a Facility 
Reservation System family. In this paper, we generalize experiences from our domain 
engineering projects, highlighting the advantages that generative techniques offer to 
component-based systems. 
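
To make the mechanism more concrete, the sketch below mimics a frame processor in a few lines of Python. It is a toy under stated assumptions: the `#BREAK name` marker, the `adapt` function and the dictionary that plays the role of a specification frame are invented for illustration and do not reproduce the actual frame command syntax or any real tool.

```python
import re

# Hypothetical breakpoint marker, invented for this sketch.
BREAKPOINT = re.compile(r"^(\s*)#BREAK (\w+)\s*$")


def adapt(frame_text, inserts):
    """Return frame_text with each breakpoint line replaced by the lines
    supplied for it in `inserts`; breakpoints without an entry are dropped."""
    out = []
    for line in frame_text.splitlines():
        match = BREAKPOINT.match(line)
        if match is None:
            out.append(line)
            continue
        indent, name = match.groups()
        for extra in inserts.get(name, []):
            out.append(indent + extra)
    return "\n".join(out)


# A generic "frame": plain text with one breakpoint (here the text is Python).
GENERIC_FRAME = """\
def retrieval_methods():
    methods = ["by facility"]
    #BREAK extra_methods
    return methods
"""

# A stand-in for a specification frame: what to insert at which breakpoint.
SPEC_BY_USER = {"extra_methods": ['methods.append("by user")']}

custom_source = adapt(GENERIC_FRAME, SPEC_BY_USER)
default_source = adapt(GENERIC_FRAME, {})
```

The point of the design is that the variant lives in the specification, not in the generic text: `default_source` contains no trace of the optional code, while `custom_source` contains exactly the requested addition.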



3 An FRS Family and Its Runtime Architecture 

In this section, we briefly introduce our working example, a family of Facility 
Reservation Systems (FRS for short). 



3.1 A Brief Description of the Facility Reservation System (FRS) Family 

Members of the FRS family include facility reservation systems for offices, 
universities, hotels, recreational and medical institutions. An FRS helps company staff 
in the reservation of rooms and equipment (such as PCs or OHPs). An FRS 
maintains records of reservations and allows users to add new reservations and delete 




S. Jarzabek and P. Knauber 



an existing reservation. When reserving a room, the user should be able to specify the 
equipment that he/she needs to be available in the room. 

Here are some of the variant requirements in the FRS domain: 

1. Different institutions have different physical facilities and different rules for 
making reservations. 

2. In certain situations, FRS should allow one to define facility types (such as 
Meeting Room and PC) and only an authorized person should be allowed to add 
new facility types and delete an existing facility type. 

3. The calculation of reservation charges, if any, may be performed according to 
discounts associated with each user and the payment classification of each facility. 

4. Some FRSes should allow users to view existing reservations by facility ID and/or 
by reservation date. 

5. Sometimes an FRS may need to maintain a database of all the facilities in a company. 
One should be able to add a new facility of a given type, enter a facility description (if 
applicable) and delete an existing facility. 

6. Some FRSes should also allow the user to search for available Rooms with the 
necessary piece of equipment. 

7. Most often, users (individuals or companies) manage their own reservations with 
an FRS. In some companies, however, users send reservation requests to a 
middleman who makes reservations for them. In general, users may only perform 
certain functions according to the permissions assigned to them. 

Even this simplified description of the facility reservation domain hints at variant 
requirements, dependencies among requirements and potential problems with 
handling them during the design and evolution of FRS components. 



3.2 A Component-Based Runtime Architecture for FRS 

We shall aim at the component-based runtime architecture for FRSes depicted in 
Fig. 1. The architecture is three-tiered. Each tier provides services to the tier 
above it and serves requests from the tier below. Each tier is built of components. A 
tier itself is yet another large granularity component. The FRS client tier contains, 
among others, components for reservation, user and facility management. These 
components initialize and display Java panels and capture user actions. The middle 
tier, in addition to implementing FRS business logic, provides the event-handling 
code for the various user interface widgets (e.g. buttons). It also contains components 
that set up and shut down connections with the DBMS tier. User interface 
components together with their event-handling code and business logic counterparts 
form higher level components organized using the PAC design pattern [12]. For example, 
“Reservation Management Panel”, “Reservation Management Logic” and event- 
handling components together form a higher level component “Reservation 
Management” (not shown in the diagram). 

The FRS client and server tiers communicate via method invocation calls. The 
communication between tiers is detailed in Table 1. 

Table 1. Description of Connectors 

| Connector | Description |
|---|---|
| FRS Client to FRS Server | Method invocations that are triggered as a result of user actions (e.g. pressing a button to reserve a facility) on the client. Examples are adding a reservation and viewing the details of a facility. |
| FRS Server to FRS Client | May be reservation, user or facility data that is returned as a result of a method invocation. In some cases, error or exception data may be returned. |
| FRS Server to DBMS | SQL call to retrieve data from one or more databases managed by the DBMS. An example of such calls could be a request to find the largest existing reservation ID. |
| DBMS to FRS Server | Data from the DBMS’ database(s) that is returned as a result of some SQL call. |

4 Supporting an FRS Family with Runtime Component Architecture 

The development situation described in this section reflects the capabilities of DCPs, such 
as EJB™ or ActiveX™, and their visual software development environments. An FRS 
runtime architecture is depicted in Fig. 1. Facilities are provided to introspect and 
customize components [17]. 



4.1 Addressing Variant Requirements 

To see how developers would customize and evolve FRS components, consider 
the variant requirements listed in point 4 of section 3, namely viewing reservations by 
facility and reservation date. These are anticipated variant requirements, therefore 
they should be implemented within the components of the FRS architecture. Components 
relevant to the viewing reservation requirement include user interface panels for 
selecting reservation viewing options and displaying the results and functional 
components (middle tier) implementing retrieving reservations from the database. All 
the anticipated variant requirements must be implemented within these components. 
A developer will customize components (for example, by setting property values) to 
indicate which reservation viewing methods and which event handlers are needed in a 
custom FRS. After customization, components will reveal only required functions. 

There are two problems with the above method of implementing variant 
requirements. Firstly, if a given variant requirement affects many components, it will 
be up to a developer to keep track of all of these components. This may not be easy 
for large architectures and variations that affect many components. Secondly, 
components will grow in size as all the variants must be implemented within them. 
Customized FRSes will also contain implementation of all the variants, even though 
some of these variants will never be used. 
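
The second problem can be seen in a small sketch. The class below is hypothetical (its name, the two viewing methods and the shape of the reservation records are assumptions, not part of the FRS implementation): every anticipated variant is implemented inside the component, and property values only select which variants are revealed.

```python
class ReservationViewer:
    """Hypothetical component that ships every anticipated viewing variant."""

    # All variants are implemented inside the component ...
    _ALL_METHODS = {
        "by_facility": lambda res, key: [r for r in res if r["facility"] == key],
        "by_date": lambda res, key: [r for r in res if r["date"] == key],
    }

    def __init__(self, enabled):
        # ... and property values merely select which ones are revealed.
        unknown = set(enabled) - set(self._ALL_METHODS)
        if unknown:
            raise ValueError(f"unknown viewing methods: {sorted(unknown)}")
        self.enabled = set(enabled)

    def view(self, method, reservations, key):
        """Run an enabled viewing method; refuse methods that were not selected."""
        if method not in self.enabled:
            raise LookupError(f"{method} is not enabled in this custom FRS")
        return self._ALL_METHODS[method](reservations, key)


# A custom FRS that only needs viewing by facility still carries "by_date".
viewer = ReservationViewer({"by_facility"})
```

Customization hides the unused variant from callers, but the code for it remains in every deployed copy of the component.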



4.2 Evolving an Architecture 

Suppose now we wish to evolve the FRS architecture to accommodate a new 
requirement that had not been anticipated before. For this, assume we want certain 
FRSes to produce a list of all reservations for a specified user. As we think that many 
FRSes will need this feature in the future, we wish to include the new requirement 
into the FRS architecture for future reuse. 

The following plan leads to implementing this new variant requirement: the 
“Reservation Management Panels” component in the top tier (Fig. 1) provides 
functionality for building the menu for viewing reservations. It also displays the 
available retrieval methods for reservations and elicits the user’s choice of retrieval 
method. Based on the user’s choice of retrieval method, this component calls the 
appropriate function from the “Facility Management Logic” component (middle tier) 
that implements the logic for a given retrieval method. Under the list of available 
retrieval methods, we must add an additional choice “view reservations made by 
user”. We also need to augment middle tier components to include event handling code 
for the new retrieval method and add implementation of a new function that retrieves 
reservations from the database into the “Facility Management Logic” component. 
Implementation of the new requirements affects interfaces of some components. In 
particular, we need type declarations for a data structure to retrieve reservations for a 
particular user and interface declarations for operations that will allow the FRS client 
to obtain a list of registered users and reservation data. 

If the source code for the user interface and “Facility Management Logic” 
components is not available, there is no simple way to implement this new 
requirement. We may need to re-implement affected components, as components 
cannot be extended in arbitrary ways without the source code. If the source code for 
relevant components is available, we could use either inheritance or a suitable design 
pattern [12] to create new components that would include functionality for viewing 
reservations by user ID. Any visual environment supports the former solution and 
some support the latter [25]. While this method of addressing new requirements is 
sufficient in the rapid application development situation, it presents certain dangers in 
the context of system families. Over years of evolution, an architecture may be 
affected by many new requirements. Implementation of new requirements will add 
new components (or new versions of old components) to the architecture. As certain 
requirements may appear in different combinations, we may end up with even more 
components. As a result, our architecture may become overly complex and difficult to 
use. 

To illustrate the above problem, let us assume that, over years of evolution, four 
sources of change have affected a component X. These changes might include 
unexpected requirements or changes in computing technology. If these changes were 
independent of each other, then chances are that we shall have four new versions of 
component X in the architecture. If it also happens that the four new requirements are 
optional and may appear in any combination in the systems built based on the 
architecture, then we might have as many as $2^4 = 16$ versions of component X. A 
configuration management system may be needed to manage versions and valid 
combinations of components. 
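
The count follows from treating each optional requirement as independently present or absent. A minimal Python sketch of that enumeration is given here; the requirement names R1 to R4 are placeholders and do not come from the FRS domain.

```python
from itertools import combinations


def component_versions(optional_requirements):
    """Return every subset of the optional requirements.

    Each subset stands for one potential version of component X.
    """
    versions = []
    for size in range(len(optional_requirements) + 1):
        versions.extend(combinations(optional_requirements, size))
    return versions


# Hypothetical names for the four independent, optional requirements.
requirements = ["R1", "R2", "R3", "R4"]
versions = component_versions(requirements)
print(len(versions))  # number of versions, including the one with no options
```

Each further optional requirement doubles the number of potential versions, which is why explicit version management quickly becomes necessary.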

Of course, the above scenario is pessimistic and, if we have access to the 
component’s source code, we can reduce the number of versions by re-designing 
component X. Such refinements should be routinely done on the evolving 
architectures, but again we have to face the problem of implementing into 
components lots of optional functionality and having flags to activate or deactivate 
the options. These components, growing in size and complexity, have to be included 
into any system built based on the architecture, independently of whether the options 
are needed or not. In the long term, the cumulative result of this practice is likely to be 
prohibitive to effective reuse. 



5 A Generative Approach to Development of Component-Based 
Systems 

In this section, we shall describe how we can prevent components from growing in 
size and number. We shall consider a program construction environment with a 
generator that produces component-based custom systems from a generic architecture. 
Further motivation for our construction environment stems from the following 
argument: Suppose some variant requirement R affects five runtime components. 
These five components may be loosely coupled during runtime. However, from the 
program construction and maintenance perspective, these five components are tightly 
coupled by means of the same source of change that affects all of them. To facilitate 
the change, it is important that a construction environment reflects tight coupling 
between these components. Therefore, required properties of generic architectures 
and components of a construction environment are often different from those of 
runtime architectures and components. 

For clarity, we shall call elements of the runtime architecture components, and 
the design and source code elements of a generic architecture in a construction 
environment construction units. 



5.1 Guidelines for a Construction Environment for Evolving 
Component-Based Systems 

Based on the previously discussed problems, we propose the following guidelines for 
a construction environment to support customization and evolution of component- 
based systems: 

1. An explicit domain model and architecture constraints. The domain model should 
clarify common and variant requirements to be supported. Architectural constraints 
referring to the component structure of runtime systems (as well as other runtime 
concerns) should also be described. 

2. Levels of customization. Customization of a generic architecture to meet variant 
requirements should be done at the architecture level first (for example, by 
selecting architecture construction units affected by variants, modifying interfaces 
of relevant runtime components, etc.) and at the code level next. A variation at the 
architecture level may be equivalent to many variations at the code level. 
Therefore, levels of abstraction simplify the mapping between variant requirements 
and a generic architecture. 

3. Composition of construction units of a generic architecture. As in aspect-oriented 
programming [16], frame technology [2] or Boca [10], we should be able 
to define different aspects of a system (e.g., business logic, platform-specific logic 
for event handling, optimization techniques, etc.) in separate construction units. 
Also, processing elements relevant to different sources of changes (e.g., variant 
requirements) should be defined separately from each other. A composition 
operation should be provided to produce custom runtime components that have 
required properties from relevant construction units. The actual form of a 
composition operation will depend on the nature of construction units. 

4. Focusing on customization process rather than on customized components. We 
need to specify customizations separately from affected construction units and 
components. The composition operation should be automated so that routine 
customizations can be performed automatically. This requirement calls for using 
generative techniques. A generator will read specifications of required variants, 
customize affected components and assemble them into a custom system. This 
arrangement will allow us to keep the number of construction units and 
components of a runtime architecture under control. The customization process can be 
repeated whenever required. Also, the knowledge of how variant requirements 
affect the architecture will not be lost. By studying the customization process for 
anticipated variant requirements, developers will better understand how to evolve a 
generic architecture by implementing new unexpected requirements into it. 

5. Human-oriented construction environment. Customization and evolution of a 
generic architecture and its construction units cannot be fully automated. A generic 
architecture should be understandable to developers. Understanding the impact of 
variant requirements on the architecture is particularly important. 
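
Guidelines 3 and 4 can be illustrated with a small Python sketch. It is only an assumption about how such a composition operation might look, not the generator used in the project: the `ConstructionUnit` class, the `generate` function and the placeholder code strings are invented for illustration, while the component and variant names are borrowed from the FRS example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConstructionUnit:
    """A source-level fragment kept separate per source of change."""
    component: str           # runtime component the fragment contributes to
    variant: Optional[str]   # variant requirement it implements; None = common
    code: str                # placeholder for the actual source text


def generate(units, selected_variants):
    """Compose common units and units of the selected variants into components."""
    components = {}
    for unit in units:
        if unit.variant is None or unit.variant in selected_variants:
            components.setdefault(unit.component, []).append(unit.code)
    return {name: "\n".join(parts) for name, parts in components.items()}


units = [
    ConstructionUnit("FacilityManagementLogic", None, "core reservation logic"),
    ConstructionUnit("FacilityManagementLogic", "ReserveByDate", "view_by_date logic"),
    ConstructionUnit("FacilityManagementLogic", "ReserveByFacility", "view_by_facility logic"),
    ConstructionUnit("UserInterface", "ReserveByDate", "view_by_date panel"),
]

# The specification of required variants is held apart from the units themselves.
custom = generate(units, {"ReserveByDate"})
```

Because the selection is an input to the generator rather than a property of the components, the same set of construction units yields a different custom system for each combination of variants, and unselected variants leave no trace in the generated components.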

In selecting a technology for a construction environment, we kept in mind that a 
generator should be able to cope with both anticipated and unexpected changes. There 
is a wide spectrum of generative techniques but independently of the specific 
approach, a generator is always implemented on top of a generic architecture. Some 
generators (for example, the frame processor [2]) make their generic architectures open to 
developers, while others (for example, compiler-compilers) allow developers to 
manipulate the architecture indirectly through a meta-language. Revealing an 
underlying architecture through a specification window is elegant but may limit the 
ways we can deal with new unexpected changes. This may impede architecture 
evolution. Therefore, we found an open architecture generator more appropriate for 
the problem we are dealing with in this paper. 



5.2 A Construction Environment for Component-Based Systems 

Fig. 2 outlines the logical structure and main elements of a solution that complies with 
the guidelines listed in the previous section. We shall first describe our solution in general 
terms and then explain how we implemented it using frame technology [2]. A 
construction environment includes a generic architecture and two global structures, 
namely, a domain model and a Customization Decision Tree (CDT). A generic 
architecture is a hierarchy of construction units from which custom components of a 
runtime system are produced. The double arrow in Fig. 2 represents processing that is 
required to produce runtime system components from customized construction units 
of a generic architecture. This includes selection and customization of construction 
units, generation of custom components, assembling components into a custom 
system and testing. 

The role of a domain model and CDT is to facilitate understanding of variations in 
a domain, understanding of how variations are reflected in a generic architecture and 
what customizations are required to incorporate variations into components. 

The reader may refer to [8] for a detailed description of the techniques we used to model the 
variant requirements in the FRS domain. A Customization Decision Tree (CDT) helps 
understand customizations that lead to satisfying variant requirements. A 
“customization option” refers to a particular way to customize the generic 
architecture. A CDT is a hierarchical organization of customization options. At each 
level of a CDT, there may be either conceptual groupings of customization options or 
customization options themselves. The hierarchical organization facilitates the visual 
location of a particular customization option. Dotted lines in Fig. 2 link variant 
requirements to a corresponding customization option. For each customization option 
in a CDT, there is a customization script that specifies a chain of modifications to the 
construction units of a generic architecture in order to meet the corresponding variant 
requirement. Scripts are in machine-readable form and are executed by the generator. 
Therefore, customizations to accommodate anticipated variant requirements are 
automated and can be repeated on demand, but may still be subject to human analysis and 
modification. This is important when developers need to extend a generic architecture 
with a new unexpected requirement. Developers start by inspecting other similar 
requirements under the related option in a CDT (there is always one at some level of a 
CDT) to find out how they are implemented. Developers should at least obtain certain 
clues as to how to implement a new requirement consistently with the rationale and 
structure of a generic architecture. Once a customization script for a new requirement is written, 
the CDT is extended to reflect the new requirement. 
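
The structure of a CDT can be sketched as a small tree in Python. This is an illustrative assumption rather than the representation used in the project: `CDTNode`, `make_script` and `customize` are hypothetical names, and a script is reduced to a function that records which construction units it would change.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class CDTNode:
    """A grouping (script is None) or a customization option (script is set)."""
    name: str
    script: Optional[Callable[[dict], None]] = None
    children: List["CDTNode"] = field(default_factory=list)

    def add(self, child: "CDTNode") -> "CDTNode":
        self.children.append(child)
        return child

    def find(self, name: str) -> Optional["CDTNode"]:
        """Depth-first search for the node with the given name."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None


def make_script(change: str) -> Callable[[dict], None]:
    """Build a script that records one change per affected construction unit."""
    def script(architecture: dict) -> None:
        for unit in ("component interfaces", "Facility Management Logic", "user interface"):
            architecture.setdefault(unit, []).append(change)
    return script


def customize(root: CDTNode, options, architecture: dict) -> dict:
    """Run the script attached to each selected option, as a generator would."""
    for name in options:
        node = root.find(name)
        if node is None or node.script is None:
            raise KeyError(f"no customization option named {name!r}")
        node.script(architecture)
    return architecture


root = CDTNode("FRS")
retrieval = root.add(CDTNode("Retrieval Method"))
retrieval.add(CDTNode("ReserveByFacility", make_script("view reservations by facility")))
retrieval.add(CDTNode("ReserveByDate", make_script("view reservations by date")))

custom = customize(root, ["ReserveByDate"], {})
```

Keeping the script on the option node means the same tree serves two readers: the generator executes the script, and a developer can open it to see which construction units a requirement touches.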

As indicated in Fig. 3, a CDT may describe a wide range of choices that occur at 
construction-time or runtime, and refer to both functional and non-functional 
requirements, runtime architecture for target systems, platforms, etc. 

There are many technical scenarios to realize the above concept of a construction 
environment. We believe Fig. 2 reflects fundamental elements that appear in any 
software evolution situation. Of course, these elements will appear in different, not 
necessarily explicit, forms, depending on a software development method and 
technology used. For example, in most companies there is no explicit domain 
model, links between variations and related customizations remain undocumented and 
we have a manual customization process instead of a generator. Most often, the 
emphasis is on management of already customized components. The actual 
representation of a generic architecture may also range from program files 
instrumented with conditional compilation commands to Object-Oriented frameworks 
and component frameworks, just to mention a few possibilities. 

Fig. 3. Partial Customization Decision Tree with Annotations 

In our project, we used frame technology [2] (reviewed in section 2) to implement 
a generator-based construction environment for the FRS family. We designed a 
generic architecture as a hierarchy of frames. The construction units of the generic 
FRS architecture directly correspond to runtime components depicted in Fig. 1 
(generally, this need not be the case). Construction units of the generic FRS 
architecture are framed Java applications and applets and framed IDL¹ specifications 
of component interfaces. We “framed” both component code and component 
interfaces. A frame includes breakpoints at which it can be customized to 
accommodate variant requirements. The specification frame (the topmost circle on the 
right-hand side of Fig. 2) specifies all the customizations needed to incorporate 
required variations into a custom system. A generator (frame processor, in this case) 
propagates changes to the frames below. The above arrangement complies with 
guidelines 3 and 4: customizations are specified separately from construction units 
and can be executed automatically when needed. The reader can find a detailed 
description of the frame-based architecture for the FRS family in [9]. 
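
The idea of breakpoints and a specification frame can be approximated in Python with the standard `string.Template` class. The sketch below is not frame-processor syntax; the `Frame` class, the breakpoint names and the interface and class text are all invented for illustration.

```python
from string import Template


class Frame:
    """A text unit with named breakpoints, each with default content."""

    def __init__(self, text, defaults):
        self.template = Template(text)
        self.defaults = defaults

    def process(self, customizations):
        """Fill each breakpoint with its customization, else its default."""
        values = {**self.defaults, **customizations}
        return self.template.substitute(values)


# Hypothetical framed interface and framed code, one breakpoint each.
frames = {
    "interface": Frame(
        "interface Reservation {\n  void reserve();\n${retrieval_ops}};\n",
        {"retrieval_ops": ""},
    ),
    "logic": Frame(
        "class ReservationLogic {\n  void reserve() { /* ... */ }\n${retrieval_impl}}\n",
        {"retrieval_impl": ""},
    ),
}

# The specification frame: every customization for one variant in one place.
spec = {
    "retrieval_ops": "  ReservationList viewByDate(in Date d);\n",
    "retrieval_impl": "  List viewByDate(Date d) { /* ... */ }\n",
}


def generate(frames, spec):
    """Propagate the specification to every frame below it."""
    return {name: frame.process(spec) for name, frame in frames.items()}


custom = generate(frames, spec)
```

Processing the same frames with an empty specification yields the default interface and code, so the variant appears in the generated text only when the specification asks for it.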



5.3 Addressing Variant Requirements 

As before, we shall consider reservation viewing options (variant requirement 4 of 
section 3). The customization proceeds as follows. We inspect the CDT (Fig. 3) in a 
top-down manner, trying to locate a customization option that describes a variant 
requirement we wish to include in the FRS. We notice customization options 
corresponding to the variant requirements retrieval ReserveByFacility and ReserveByDate 
under the “Retrieval Method” grouping. For each of these options, we find a 
customization script that specifies a chain of modifications to generic FRS 
architecture components that are required to address the corresponding variant 
requirement. We check whether the requirements retrieval ReserveByFacility and 
ReserveByDate conflict with other variant requirements we wish to address. 
If they do not, we use the proper customization scripts to customize the generic FRS 
architecture. 

¹ IDL is a trademark of Object Management Group 

A generator of a construction environment automates routine customizations. The 
developer is not concerned with customizations triggered by anticipated variant 
requirements: they are indicated in the CDT and are automatically executed by the 
generator. The developer does not have to remember which components and 
component interfaces must be modified and how to implement modifications 
consistently across different components. In contrast, if runtime components are all 
we have, the developer must be concerned with all those issues. Finally, generated 
runtime components of the customized FRS will contain implementation of only those 
variant requirements that are really required. 
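
The conflict check that precedes customization could be sketched as follows. This is a hypothetical illustration only: the conflict table, the option `NoReservationViewing` and the script steps are invented, and the sketch does not describe how conflicts were handled in the project.

```python
def find_conflicts(selected, conflicts):
    """Return the conflicting pairs whose members are both selected."""
    chosen = set(selected)
    return [pair for pair in conflicts if set(pair) <= chosen]


# Hypothetical conflict table; "NoReservationViewing" is an invented option.
CONFLICTS = [("ReserveByDate", "NoReservationViewing")]

# Hypothetical scripts, each reduced to a list of modification steps.
SCRIPTS = {
    "ReserveByDate": ["extend interface for dates", "add date view"],
    "ReserveByFacility": ["extend interface for facilities", "add facility view"],
    "NoReservationViewing": ["remove viewing panel"],
}


def customize(selected, scripts=SCRIPTS, conflicts=CONFLICTS):
    """Refuse conflicting selections, otherwise chain the selected scripts."""
    clashes = find_conflicts(selected, conflicts)
    if clashes:
        raise ValueError(f"conflicting variant requirements: {clashes}")
    return [step for option in selected for step in scripts[option]]


steps = customize(["ReserveByFacility", "ReserveByDate"])
```

Running the check before any script executes keeps a rejected selection from leaving the generic architecture partly customized.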



5.4 Evolving an Architecture 

With a construction environment, we implement a new requirement to view 
reservations by user ID in the following way. We start by inspecting the CDT (Fig. 3) 
to determine if a customization option for this requirement exists. We find the 
“Retrieval Method” grouping, but currently there is no customization option for the 
requirement to view reservations by user ID. To implement the new requirement, we 
inspect the two existing customization options under the “Retrieval Method” 
grouping, namely retrieval ReserveByFacility and ReserveByDate. For each of these 
customization options, we find (at the CDT node) a script that describes a chain of 
modifications required to implement a corresponding requirement. We study these 
scripts to understand which construction units are selected and customized. Then, we 
proceed to specifying modifications required to implement the new requirement. 
Finally, we add the customization option for the new requirement ReserveByUser under 
the “Retrieval Method” grouping. 
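
The evolution step can be sketched in Python with the CDT reduced to nested dictionaries. The script contents, the helper names `units_touched` and `add_option`, and the wording of each modification are assumptions made for illustration; only the grouping and option names come from the example above.

```python
# Hypothetical CDT fragment: grouping -> option -> script, where a script is a
# list of (construction unit, modification) steps.
cdt = {
    "Retrieval Method": {
        "ReserveByFacility": [
            ("component interfaces", "declare retrieval by facility"),
            ("Facility Management Logic", "implement retrieval by facility"),
            ("user interface", "add facility view"),
        ],
        "ReserveByDate": [
            ("component interfaces", "declare retrieval by date"),
            ("Facility Management Logic", "implement retrieval by date"),
            ("user interface", "add date view"),
        ],
    }
}


def units_touched(cdt, grouping):
    """Collect the construction units modified by existing scripts in a grouping."""
    scripts = cdt[grouping].values()
    return sorted({unit for script in scripts for unit, _ in script})


def add_option(cdt, grouping, option, script):
    """Extend the CDT with a new customization option and its script."""
    if option in cdt[grouping]:
        raise ValueError(f"option {option!r} already exists")
    cdt[grouping][option] = script


# Study the existing scripts, then write the new one along the same lines.
affected = units_touched(cdt, "Retrieval Method")
new_script = [(unit, f"support retrieval by user ID in {unit}") for unit in affected]
add_option(cdt, "Retrieval Method", "ReserveByUser", new_script)
```

The existing scripts act as a checklist here: the new script touches the same construction units as its siblings, which is one way to keep a new requirement consistent with the structure of the generic architecture.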

With the construction environment, evolution of the generic architecture is done in 
a systematic way and does not unnecessarily complicate the structure of a generic 
architecture. Rather than an end product of customization, we record the 
customization process itself, within customization scripts organized around the CDT. 
This record shows the trace of modifications triggered by a given requirement and 
reveals the rationale for modifications. The architecture becomes a library of 
organized and documented variant requirements together with specifications of how 
to include different variant requirements into the target product. 



6 Conclusions 

In the paper, we discussed advantages that generative techniques offer to component- 
based systems. In component-based systems, components tend to grow in size and 
number as more and more new requirements are implemented into them. We analyzed 
and illustrated with examples the mechanism that produces this unwanted effect. 
Then, we described a general scenario of how the problem can be alleviated with the 
aid of generative techniques. Finally, we described a construction environment, based 
on a commercial generator of frame technology [2], that we built to support a family 
of component-based Facility Reservation Systems (FRS). Our construction 
environment contains a generic architecture that is open to developers and a 
Customization Decision Tree (CDT) that guides both developers and the generator in 
customizing components. Nodes in the CDT represent variant requirements. Attached 
to the nodes are customization scripts that specify how to include required variants 
into components of a custom system. The CDT also helps developers evolve a generic 
architecture in case of new unexpected requirements. In the paper, we compared 
development and evolution of component-based systems in two situations: The first 
one reflected capabilities of DCPs (such as EJB™ or ActiveX™) and their visual 
environments. In the second situation, we extended a DCP with a construction 
environment based on a generator. We showed how our construction environment 
aids developers in customizing and evolving a component-based system family. 

The results presented in this paper are based on experimental work that was limited 
in depth and breadth. In our project, we addressed variations in functional 
requirements. It is not clear if system-wide qualities such as performance, reliability 
or fault tolerance can be addressed in the same way. It would be interesting to 
evaluate in detail which computational aspects of a program can be most conveniently 
separated using techniques such as aspect-oriented programming [16], frame 
technology [2] or Object-Oriented techniques. Although we explicitly model 
requirement dependencies in our domain model [8], during customization of a generic 
architecture, we deal with dependent requirements in an ad hoc way. This raises the issue 
of how we can avoid (reconcile?) conflicts during customization and how we can 
assure the correctness of the customized product. Currently, transition from a domain 
model to generic and runtime architectures is also rather ad hoc. More experiments 
are needed to formulate guidelines to help developers in this difficult task. We plan to 
address the above open problems in future work. 

We believe the domain engineering process itself is not well understood yet. In our 
FRS project, having described an FRS domain, we came up with a component-based 
runtime architecture for the FRS family and then we developed a generic FRS 
architecture and a CDT. We believe that a systematic methodological framework is 
required to coordinate different activities involved in domain engineering. PuLSE [4] 
proposes such a framework based on experiences from many domain engineering 
projects. We are exploring the approach described in this paper as one of the strategies 
within a general framework provided by PuLSE. 

Each technology mentioned in this paper contributes a useful solution to a certain 
development problem, but none delivers a complete solution. Problem domain, 
software design, runtime architecture and evolution all seem to form interrelated 
but different dimensions of the software development problem. To effectively deal with 
them, we need specialized methods that fit the purpose. It is a challenge for software 
engineering to make those methods compatible with each other, so that they can help 
developers build better software systems. 

Abstract. A successful software system evolves over time, but this evolution 
often occurs in an ad hoc fashion. One approach to structuring system evolution 
is the concept of software product lines, where a core architecture supports 
a variety of application contexts. However, in practice, the high cost 
and high risks of redevelopment as well as the substantial investments made 
to develop the existing systems most often mandate significant leverage of 
the legacy assets. Yet, there is little guidance in the literature on how to transition 
legacy assets into a product line set-up. 

In this paper, we present RE-PLACE, an approach developed to support 
the transition of existing software assets towards a product line architecture 
while taking into account anticipated new system variants. We illustrate this 
approach with its application in an industrial setting. 

Keywords: software product line, architecture recovery, reengineering, re- 
use, domain-specific software architecture 

1 Introduction 

1.1 Problem 

One of the principal characteristics of a successful software product is that, over time, 
it will be reused and adapted for purposes that become increasingly different from the 
product’s original purpose. At the same time, customer, project, and organizational 
pressures make it very difficult to coherently manage the resulting multiple develop- 
ment and maintenance threads. 

Apart from unnecessary duplication of effort, some of the possible consequences 
are structure degradation, unmanaged diversity, and little conceptual integrity among 
the system variants. These adverse consequences often imply additional evolution time 
and cost. This is a widespread situation that affects many software systems today. 

1.2 Approach and Context 

One possible way of addressing the problems mentioned in the previous section is the 
product line concept. A product line is a set of systems sharing core features while var- 
ying in others. It may be operationalized via a domain-specific software architecture, 
which, using a flexible and customizable design, embodies a platform from which sin- 
gle systems can be derived. 

Although most of the product line approaches suggest integrating existing assets 
into the product line reference architecture, there is little guidance on how to achieve 
this. Our strategy was to develop an approach to help organizations transition their ex- 
isting assets to a new, more modern design. This design should support the evolution of 
existing applications and accommodate future applications as far as possible. The result 
is RE-PLACE, an approach detailed and illustrated in this paper. 

RE-PLACE stands for Reengineering-Enabled Product Line Architecture Creation 
and Evolution. RE-PLACE was developed during the RAMSIS kernel redesign project 
to communicate the approach we were taking and to allow its application in different 
contexts. RAMSIS is a large, very successful human modeling system. Its kernel has 
evolved as a monolithic Fortran system over the past 10 years and has grown to a size 
where all maintenance tasks and new developments need considerable time. Therefore, 
it is not very well-suited for the anticipated new uses in various application contexts, 
which makes the kernel redesign necessary. 

1.3 Related Work 

To develop RE-PLACE, we explored both emerging product line concepts and tech- 
niques, and the existing reengineering ideas and methods: product line concepts and 
techniques because they provide us with the basis for thinking in terms of a family of 
systems, and reengineering ideas and methods because they allow us to leverage exist- 
ing assets. 

There are a number of domain engineering approaches proposed, such as Organizational 
Domain Modeling (ODM) [21], Model-Based Software Engineering (MBSE) 
[15], the Domain-Specific Software Architecture (DSSA) program [22], and Synthesis 
[20]. 

Although the need for reusing existing assets in domain engineering has been rec- 
ognized, there has been little published work in the past that tried to leverage existing 
assets by integrating them into a product line architecture. Existing work [6, 23] in- 
cludes process overviews of possible mechanisms and entry points for integrating reen- 
gineering activities in product line development. 

Reengineering has a rich tradition concerning the recovery of software assets and 
their integration within a new environment. One key reengineering activity in transitioning 
a system is the recovery of its architecture. Many approaches to architecture recovery 
have been proposed [7, 9, 10]. Domain-Augmented Re-Engineering (DARE) [5] 
integrates the domain dimension, showing how domain model construction and reverse 
engineering can offer a significant synergy effect while understanding applications of a 
specific domain. Whereas RE-PLACE takes a product line perspective, DARE focuses 
on understanding one individual system at a time. A complementary contribution of 
reengineering is the recovery of reusable components [3, 17, 13]. 

There is work on techniques supporting the transformation of existing assets to 
make them reusable in a different context, for example, migration towards object orien- 
tation [1, 12] or wrapping of recovered assets [19]. This work can be exploited in the 
context of transitioning existing legacy assets to a product line architecture. 

1.4 Outline 

The remainder of this paper is structured as follows: Section 2 describes the framework 
of the RE-PLACE approach. Section 3 presents how this framework was instantiated in 
the context of the RAMSIS kernel redesign project, succinctly describing the process 
and techniques applied and some of the results obtained. Section 4 concludes the paper. 

2 The RE-PLACE Framework 

Creating a product line while reusing existing assets requires both product line mode- 
ling and reengineering activities. One of the challenges is the successful combination 
of these activities, which enables smooth interaction between them and allows maximal 
reuse without compromising the product line design. 

RE-PLACE is an effort to combine reengineering and product line activities. It is a 
flexible and customizable approach, expressed as a framework (shown in Figure 1) that 
is instantiated to fulfil the specific needs of a project. 

2.1 Process 

The key idea of RE-PLACE is the use of a blackboard, a shared work space allowing 
the reengineering and product line activities to exchange and incrementally enrich in- 
formation. This blackboard contains the accumulated knowledge from the two activities 
and a coordination workspace used for synchronization purposes. 

The accumulated knowledge consists of evolving work products. The modeling 
products and reengineering assets have their primary source in the product line mode- 
ling and reengineering activities, respectively. Nevertheless, the assets and products 
may be manipulated by both activities. 

The main goals of the reengineering activities are to identify components that can 
be reused in the product line and to propose a way to integrate them into the product 
line. The deliverable produced by these activities is the repackaging and integration 
strategy. 

The product line activities aim at the creation of the product line architecture. The 
first step to accomplish this is the product line modeling. It produces the product line 
model that describes common and variable characteristics of the product line and con- 
tains a decision model. The latter is used to map these variable characteristics to differ- 
ent members of the product line. The second main product line activity is the product 
line architecture design. 
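
A decision model of this kind can be pictured as a mapping from variable characteristics to the product line members that need them. The Python sketch below is only an assumed illustration: the characteristic and member names are placeholders, and the sketch is not the notation used by RE-PLACE.

```python
# Hypothetical decision model: characteristic -> members that include it.
MEMBERS = {"member_A", "member_B", "member_C"}

decision_model = {
    "characteristic_1": {"member_A", "member_B", "member_C"},
    "characteristic_2": {"member_A"},
    "characteristic_3": {"member_B", "member_C"},
}


def characteristics_of(member, model=decision_model):
    """List the characteristics a given product line member must provide."""
    return sorted(name for name, members in model.items() if member in members)


def split_common_variable(model=decision_model, members=MEMBERS):
    """Separate characteristics shared by all members from the variable ones."""
    common = sorted(name for name, who in model.items() if who == members)
    variable = sorted(name for name, who in model.items() if who != members)
    return common, variable


common, variable = split_common_variable()
profile = characteristics_of("member_B")
```

Read in one direction the mapping derives a member from its decisions; read in the other it shows which characteristics are common to the whole product line and which vary.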
Figure 1. RE-PLACE Framework 

The final work product, the transition plan, describes the complete process of im- 
plementing the product line architecture while following the repackaging and integra- 
tion strategy to integrate the code reused from the original application. 

2.2 Expected Benefits 

One of the main expected benefits of RE-PLACE is the synergy effect obtained by the 
integration of the reengineering and product line activities. 

Early information flow from one activity to the other can help avoid duplicated 
work and the exploration of impractical alternatives. Furthermore, the information pro- 
vided by one side can be used as a starting point for the activities on the other side. 

For example, early knowledge of potentially reusable software assets can be used 
to direct the design of the product line architecture. This is more efficient than adapting 
either the architecture or the assets when they have evolved separately. Similarly, an 
early version of the product line architecture can guide the development of the repack- 
aging and integration strategy. 

3 The RAMSIS Kernel Redesign Project 



This section illustrates RE-PLACE by presenting its instantiation for the RAMSIS kernel 
redesign project. 

3.1 Context and Goal of the Project 

RAMSIS is a system for the simulation of various aspects of the human body. It provides 
a model of the human body and is able to predict the posture a human would adopt 
under given circumstances, while taking into account the human feeling of comfort. RAMSIS 
contains the statistical distribution of body measures correlated to gender, age, and 
origin. This allows customized human models to be created. The RAMSIS kernel is 
used as part of a stand-alone ergonomic tool and is also integrated into various computer-aided 
design systems. Whereas, in the past, it has been primarily used for ergonomic 
analysis by the automotive industry, there are plans now to use it in very different application 
contexts, for example, body measuring for the production of tailor-made 
clothes. 

The current structure of the RAMSIS system, however, makes it difficult to quickly 
adapt the system to these new contexts. As the initial design of the kernel dates back to 
the 1980s, it has evolved over a long period of time, leading to a complex system archi- 
tecture and many dependencies within the kernel that are not fully documented. Apart 
from impeding the further development of the kernel to address new markets, this also 
creates maintenance problems. 

To both allow the evolution of the system and improve the efficiency of mainte- 
nance activities, a complete rewrite of the kernel was considered, but rejected due to 
various reasons: First of all, the excessive amount of work involved would make the 
project prohibitively expensive and also delay the introduction of the new applications, 
which have critical time-to-market requirements. Furthermore, as the application em- 
bodies a substantial amount of experience, there would be a high risk in rewriting the 
complex mathematical functionality present in the existing system. 

Reusing some of the core functionality was considered a promising way of mitigating 
the risk of the kernel redesign while also reducing the cost and the time spent. Following 
this idea, the purpose of the RAMSIS kernel redesign project is to replace the 
monolithic, Fortran-based RAMSIS kernel with an extensible, object-oriented kernel 
implemented in C++ while maximizing reuse of existing assets. 

The main requirements for the project are: 

• The new design should be flexible to allow quick adaptations to new markets. 

• The new design should be more maintainable than the existing kernel. 

• A significant amount of existing kernel code should be reused. 

To satisfy these requirements, the following decisions were taken: The redesigned 
RAMSIS kernel will be used in a product line for the domain of human modeling and 
the existing RAMSIS kernel will be reengineered to reuse the key know-how embodied 
in its code. 

The overall RAMSIS kernel redesign project consists of two major stages. The de- 
liverables of the first stage are: 

• a new object-oriented design for the kernel 

• a set of reusable Fortran components 

• a detailed description of a strategy to integrate the reused components in the new 
object-oriented kernel 

The second stage consists of the implementation of the new kernel according to the new 
design and the integration of the identified reusable components according to the inte- 
gration strategy. This paper describes the state of the project after the first stage. The 
second stage is to be performed by the industrial partner. 

The following stakeholder groups were involved in the first project stage: 

• Domain experts in the area of human modeling 

• Application experts who were involved with the development or evolution of the 
current RAMSIS system 

• Three developer groups building applications using the RAMSIS kernel (ergo- 
nomic applications, measurement applications, and visualization) 

3.2 Blackboard Instantiation 

In this project, the blackboard was BSCW (Basic Support for Cooperative Work) [14], 
a web-based tool offering a version-controlled, distributed shared workspace that 
facilitates the exchange of documents. In addition, one team member, who participated 
in reengineering, product line modeling, and architecture design and is also a domain 
expert, supported the blackboard by highlighting the relationships among accumulated 
knowledge and decisions taken as the project progressed. 

3.3 Project Process 

The process used for the RAMSIS kernel redesign is shown in Figure 2. It is an 
instantiation of the RE-PLACE framework shown in Figure 1. 

The instantiation involves the refinement of work products and activities to support 
the specific project needs. For example, in the instantiated process, the repackaging and 
integration strategy consists of two parts due to the specific nature of the project: As 
the programming language used for the final product line (C++) is different from the 
language the existing system is written in (Fortran), the reused code has to be wrapped. 
This triggers the need for a project-specific work product, namely the wrapping scheme. 

The instantiation of the other work products and activities is straightforward, hence 
the instantiated work products are not explicitly described here, but will be discussed in 
detail in the next subsection. 

Throughout the process, open issues and tasks to be done were put on the coordina- 
tion workspace residing on the blackboard. In addition, there were joint meetings to dis- 
cuss open issues and to resolve inconsistencies. 

The product line approach used for designing the RAMSIS kernel product line is 
PuLSE™ [2], so the instantiations of the product line modeling and product line 
architecture design steps are the corresponding components of PuLSE™. PuLSE™ (Product 
Line Software Engineering) is a method to enable the conception and deployment of 
software product lines within a large variety of enterprise contexts. This is achieved via 
a product-centric focus throughout its phases, customizability of its components, an 
incremental introduction capability, a maturity scale for structured evolution, and 
adaptations to a few main product development situations. 

Although the process used in the RAMSIS project is closely tied to the way 
PuLSE™ works, RE-PLACE could also be instantiated with other product line 
approaches as long as they support the concepts of product line modeling and architecture 
design phases. 

Most reengineering activities were carried out using Refine™ [19]. This was used 
both as a reverse engineering environment (Refine/Fortran™) and as a programming 
language for the implementation of our analysis tools. Rigi [16] was used to visualize 
the information obtained by applying the analysis tools. 

Figure 2. Instantiated RE-PLACE Process for the RAMSIS Kernel Redesign Project 



3.4 Project Process Enactment 

The project enactment can be subdivided into three major phases: In the first phase, 
Identification, information about the existing system and characteristics of the 
envisioned product line are gathered. In the second phase, Modeling, the reuse of recovered 
components is prepared and the specification of the product line architecture is created. 
In the final phase, Building, the redesigned kernel architecture and the transition plan 
are built. The rest of this subsection describes the three phases in detail. 

I. Identification 

Current System Architecture 

Recovering the architecture of the current system is the first reengineering activity. The 
architectural description of the current system constitutes an overview of the logical 
system structure. For the RAMSIS system, a high-level view of the architecture was 
captured via the source file organization structure and function naming conventions, 
which identify the various kernel interfaces. This overview, presented using Rigi [16], 
offers support for communication with domain experts, application experts, and devel- 
oper groups. 

Existing Components 

The components that are interesting to recover are cohesive computation units with a 
clear interface implementing an aspect of the functionality of the system. 

The approach used in this project, dominance-driven functionality analysis, is an 
extension of Cimitile’s approach [4], to which we added various heuristics. It entails 
producing a dominance tree from the call graph of the system and selecting relevant 
subtrees. Each subtree in the dominance tree corresponds to a group of functions that 
are called only from within the subtree (except for the root, which can be called from 
other parts of the system). This effectively means that the root offers some ‘service’ to 
the rest of the system. Further heuristics based on the recovered kernel architecture (in 
particular, the external kernel interface) are then used to guide the actual selection of 
subtrees. 
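
The subtree property described above can be illustrated with a small sketch. It is not the tooling used in the project (which was built on Refine); it is a minimal Python illustration that assumes the call graph is available as a dictionary and that the system has a single entry point. All function names in the example graph are hypothetical, and the additional selection heuristics are not modeled.

```python
def dominators(call_graph, entry):
    """Map each function reachable from `entry` to the set of functions
    that lie on every call path from `entry` to it (its dominators)."""
    # Collect the functions reachable from the entry point.
    reachable, seen, stack = [], set(), [entry]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        reachable.append(node)
        stack.extend(call_graph.get(node, []))

    # Invert the call graph: for each function, who calls it.
    callers = {node: set() for node in seen}
    for node in seen:
        for callee in call_graph.get(node, []):
            callers[callee].add(node)

    # Iterative data-flow formulation of dominance.
    dom = {node: set(seen) for node in seen}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for node in reachable:
            if node == entry:
                continue
            new = set.intersection(*(dom[c] for c in callers[node])) | {node}
            if new != dom[node]:
                dom[node] = new
                changed = True
    return dom


def subtree(dom, root):
    """All functions dominated by `root`, i.e. its subtree in the dominance tree."""
    return {node for node, doms in dom.items() if root in doms}


# Hypothetical call graph; the names are illustrative only.
CALL_GRAPH = {
    "main": ["seat_analysis", "reach_analysis"],
    "seat_analysis": ["belt_position", "optimizer"],
    "reach_analysis": ["reach_space", "optimizer"],
    "belt_position": ["belt_geometry"],
    "belt_geometry": [],
    "reach_space": [],
    "optimizer": ["line_search"],
    "line_search": [],
}
DOM = dominators(CALL_GRAPH, "main")
```

In this example `optimizer` is called from two analyses, so it is absorbed into neither of them; it roots its own subtree together with `line_search`, which matches the reading of a subtree root as a 'service' offered to the rest of the system.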

For RAMSIS, this automated approach produced 250 components at various levels 
of abstraction. Some of them correspond to system features such as “estimating the belt 
position for the human model given its position in a car seat” or “computing the space 
reachable by the human model”. Others correspond to support services, for example, 
the “optimizer”, a numerical approach supporting many of the analyses in the kernel. 
The recovered components were put on the blackboard as an initial version of potential- 
ly reusable components. 

Existing Features 

The features that are provided by the current RAMSIS kernel constitute a first set of re- 
quirements for the RAMSIS product line. Furthermore, a description of the system in 
terms of its features makes it easier for the stakeholders to communicate both with us 
and among themselves. Since there are no maintained requirements documents that 
capture the current status of the existing features present in RAMSIS, we had to elicit 
this information. 

The existing features are identified by two overlapping activities that are explained 
in the following two paragraphs: 

Mapping components to features. In this activity, a domain expert tries to map 
the components identified by the reengineering side to features of the system. This is 
needed both to prepare the reuse of the components and to find a first set of features of 
the existing system. The features that have been found this way are collected in a list 
that is put on the blackboard as existing features and that is continuously expanded. 

Elicitation of existing features. While the first activity is still ongoing, the appli- 
cation experts and developer groups are asked to describe the existing RAMSIS system 
in terms of its abstract high-level features. The experts can use the initial version of the 
list of existing features on the blackboard to guide them, and they update this list as they 
find new features that have not been identified before. 

Each activity benefits from the work that is performed within the other activity: While 
mapping the components to features, features may be found that have been overlooked 
by the application experts in the elicitation process, and these features may even guide 
them towards related features that have been missed as well. New features discovered 
by the application experts facilitate the mapping of components and can also drive the 
reengineering side to look for components that implement those features. 

The final result of these activities was the set of 39 features presented in Figure 3. 
Each of them was described in a glossary to avoid misunderstandings. The potentially 
reusable components on the blackboard were then annotated with the features to which they 
correspond. 

Figure 3. Identified Existing Features 



Required Features 

In addition to the existing features, anticipated applications and developer needs are an- 
other source of requirements for the RAMSIS product line. The requirements for future 
applications are taken into account in the product line modeling and design to ensure 
stability and flexibility of the resulting product line architecture. 

In a number of interviews with the developer groups, we collected a list of antici- 
pated applications for the RAMSIS kernel. Later, in collaboration with the domain ex- 
perts, we examined these applications for the requirements they impose on the RAMSIS 
kernel. Some of the future applications were grouped and their requirements were col- 
lapsed, because the required kernel features were similar. The result of this step was a 
list of eight anticipated applications (work environment design, shoe design, medical 
applications, visualization of external measurement data, virtual reality, furniture 
design, crash simulation, CAD system integration) together with information about 
required kernel features (e.g., for shoe design it is necessary to have a more detailed 
representation of the human foot than the one available in the existing system). 

Additionally, we collected the different developer groups’ needs. These are features 
that are considered useful for several applications and that should therefore be taken 
into account for the RAMSIS product line. The required features collected in both steps 
are shown in Figure 4. 

Figure 4. New Features 



Task Scenarios 

Task scenarios are collected throughout the project. Their collection started in the iden- 
tification phase to understand the various applications and their features. 

Usually, task scenarios are used to capture the range of usage of a software product 
and the changes it might undergo over time (task scenarios were introduced in SAAM, 
the Software Architecture Analysis Method [11]). Additionally, in the RAMSIS 
project, we use task scenarios to articulate and record design decisions. 

Whenever design alternatives are discussed or decisions are taken, we try to capture 
the rationale behind the alternative or decision as a task scenario. Furthermore, we use 
task scenarios to describe the important aspects of the way the RAMSIS product line will 
be used and modified. Thus, task scenarios play a crucial role in the product line engi- 
neering part of the RAMSIS project. 

As an example for deriving task scenarios, consider the following: One of the an- 
ticipated future applications is the use of the RAMSIS human model in a virtual reality 
environment. As, in such an environment, it should be possible for the human model to 
interact with external objects, which might themselves be other human models, the fol- 
lowing task scenario was derived: “Posture prediction on some human model interact- 
ing with some other human model might require a combined analysis.” 

II. Modeling 



Product Map 

The first work product of the Modeling Phase is the product map. A product map dis- 
plays the relationships between features and the members of the product line in a two- 
dimensional matrix. 

The first step in creating the product map was to group and layer the features we 
gathered in the Identification Phase. The grouping serves as a basis for the structuring 
of the features and the layering expresses dependencies among the features. As a con- 
sequence, the layering provides information about the product line scope. 

We asked the domain experts to group and layer the collected features. We then 
presented the results to the three stakeholder groups and asked them to re-arrange the 
features if needed. They were also invited to add further features they might discover while 
considering the grouping and layering. Both the layering and the grouping were basically 
accepted by the groups. The pre-ordering facilitated the understanding and the use of 
the information presented. 

Figure 5. Simplified Product Map 

Based on the meeting sessions, we created a view of the grouped and layered fea- 
tures. The three main layers identified were: 

• Common Core: Definition of the model structure. The common core provides 
functionality to define, instantiate and access the structure of the human model. 

• Kernel: This layer provides functionality common to more than one application. 

• Application-specific layer: contains functionality that is not shared by different 
applications. 

The grouped features and their layering are depicted in Figure 8. The 
grouped and layered features served as a basis for the creation of a product map. They 
are listed on the horizontal axis of the product map and related to the applications 
covered by the product line that are listed on the vertical axis. Figure 5 shows a simplified 
version of the product map we created for the RAMSIS product line. 

The task scenarios that had been collected at that point in time were applied to the 
product map to evaluate its completeness and consistency. 

Decision Model 

Decision models are used in product line development to tailor generic product line ar- 
tifacts to a specific system that is a member of the product line. The explicit capturing 
of variability in product line artifacts plays an important role in describing the differ- 
ences among the product line members. To describe a single member of the product 
line, the variabilities in the product line artifacts have to be resolved. Decision models 
are a mechanism to accomplish that goal. 

A decision model is a structured set of unresolved decisions together with possible 
choices that, when applied to the appropriate product line artifact, enable the specifica- 
tion of a member (or a group of members) of the product line by resolving the decisions. 

In the context of the RAMSIS product line, the product map serves as a simple de- 
cision model. It relates the high level requirements to the features. For example, a kernel 
for a virtual reality application would have to contain all feature groups marked as com- 
pletely required in Figure 5. Out of those that are marked partially required, the decision 
model states that the following features are needed: “Analyses” 10, 23, 26, 27, and 52; 
“High-level manipulations” 11, 12, 13, 14 and “User File” 17 (numbers as in Figures 3 
and 4). The feature groups “Alternative Skin” and “Rule-Inference Engine” can be add- 
ed to the kernel optionally. 
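
The virtual reality example can be written down as data to show how resolving the decisions yields a kernel configuration. The following is a minimal sketch, not the notation used in the project: the feature numbers and the names of the partially required and optional groups are taken from the text above, whereas the group "Model Structure" is a hypothetical stand-in for the completely required groups of Figure 5, and the `resolve` function is an assumption made for illustration.

```python
REQUIRED, PARTIAL, OPTIONAL = "required", "partial", "optional"
ALL = "all"  # marker: every feature of the group is included

# One row of the product map, read as a decision model for one application.
VIRTUAL_REALITY = {
    "Model Structure": (REQUIRED, ALL),  # hypothetical group name
    "Analyses": (PARTIAL, {10, 23, 26, 27, 52}),
    "High-level manipulations": (PARTIAL, {11, 12, 13, 14}),
    "User File": (PARTIAL, {17}),
    "Alternative Skin": (OPTIONAL, ALL),
    "Rule-Inference Engine": (OPTIONAL, ALL),
}


def resolve(decisions, chosen_optional=()):
    """Resolve the open decisions and return {group: selected features}.

    Required and partially required groups are always included; optional
    groups are included only if they are listed in `chosen_optional`.
    """
    for group in chosen_optional:
        if group not in decisions or decisions[group][0] != OPTIONAL:
            raise ValueError(f"{group!r} is not an optional feature group")
    selected = {}
    for group, (status, features) in decisions.items():
        if status == OPTIONAL and group not in chosen_optional:
            continue
        selected[group] = features
    return selected


MINIMAL_KERNEL = resolve(VIRTUAL_REALITY)
EXTENDED_KERNEL = resolve(VIRTUAL_REALITY, chosen_optional=["Alternative Skin"])
```

Calling `resolve` with no optional groups yields the minimal virtual reality kernel; naming an optional group adds it as a whole.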

The Wrapping Scheme 

The contribution of reengineering to the Modeling Phase is the wrapping scheme, a 
work product that describes how reused components are integrated into the new C++ 
kernel while hiding the fact that they contain Fortran code. 

The actual design of the wrapping scheme was based on the design decisions that 
were taken up to this point. One of them was to completely redesign the human model 
in an extensible, object-oriented way using C++. In the old kernel, the human model 
was stored in Fortran variables, so the main problem to be solved when integrating re- 
used components into the new kernel was to make sure that the old Fortran code would 
work in spite of the new representation of the human model. 

The idea of the wrapping is to provide C++ call interfaces for the reused Fortran 
components that would allow specific actions to be taken before and after calling the 
actual Fortran routine in order to map the new object-oriented human model to the data 
format the wrapped Fortran routines expect. A Fortran component with such an 
interface is called a wrapped object (WO). Before entering a WO from C++, the state of the 
human model in C++ is transferred to the corresponding Fortran variables. Similarly, 
when returning from a call to Fortran, the C++ state is updated to reflect the changes 
made during the call. 

The transfer of data from the C++ human model to the Fortran data and vice versa 
is performed by a consistency manager that knows which data is accessed by each of 
the WOs and that also keeps track of which data has already been transferred to the 
Fortran data block in order to avoid unnecessary transfer of data. 

Figure 6. The Wrapping Scheme Used for RAMSIS 
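
The interplay of wrapped objects and the consistency manager can be sketched as follows. This is a language-neutral illustration in Python, not the C++/Fortran design of the project: a dictionary stands in for the Fortran data block, a plain Python function stands in for the legacy routine, and all class, method, and attribute names are hypothetical.

```python
class HumanModel:
    """Stand-in for the new object-oriented model state."""

    def __init__(self, **values):
        self._values = dict(values)
        self.changed = set(values)  # keys not yet transferred to the legacy block

    def get(self, key):
        return self._values[key]

    def set(self, key, value):
        self._values[key] = value
        self.changed.add(key)


class ConsistencyManager:
    """Moves data between the model and the legacy data block on demand."""

    def __init__(self, model, legacy_block):
        self.model = model
        self.legacy = legacy_block  # dict standing in for the Fortran variables
        self.transfers = 0          # counts model-to-legacy copies

    def before_call(self, reads):
        # Transfer only the data that changed since the last transfer.
        for key in reads:
            if key in self.model.changed:
                self.legacy[key] = self.model.get(key)
                self.model.changed.discard(key)
                self.transfers += 1

    def after_call(self, writes):
        # Reflect the routine's changes in the model; both sides now agree.
        for key in writes:
            self.model.set(key, self.legacy[key])
            self.model.changed.discard(key)


class WrappedObject:
    """Call interface around a legacy routine with declared data accesses."""

    def __init__(self, routine, reads, writes, manager):
        self.routine, self.reads, self.writes = routine, reads, writes
        self.manager = manager

    def __call__(self):
        self.manager.before_call(self.reads)
        self.routine(self.manager.legacy)
        self.manager.after_call(self.writes)


def legacy_reach(block):
    """Hypothetical legacy routine operating on the shared data block."""
    block["reach"] = block["arm_length"] * 2


model = HumanModel(arm_length=70, reach=0)
manager = ConsistencyManager(model, legacy_block={})
compute_reach = WrappedObject(legacy_reach, ["arm_length"], ["reach"], manager)

compute_reach()              # first call transfers arm_length
compute_reach()              # nothing changed, so nothing is transferred
model.set("arm_length", 80)  # the model changes on the object-oriented side
compute_reach()              # only the changed value is transferred again
```

The transfer counter shows the point of the bookkeeping: three calls cause only two transfers, because the second call finds the legacy block already up to date.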

The wrapping scheme was designed in an iterative manner making use of the col- 
lected task scenarios. The final version contains detailed guidelines for the design of the 
required call interfaces and the design of the consistency manager. A prototypical im- 
plementation of the consistency manager was built to run performance estimation tests. 

III. Building 



Wrap/rewrite proposal 

Not all of the components in the list of potentially reusable components accumulated 
on the blackboard can be sensibly reused. For example, it is often not a good choice to 
reuse a component that is simple and operates directly and exclusively on data struc- 
tures that will be redesigned. In this case, it is preferable to rewrite the component. 

The process of evaluating and selecting components for wrapping is based on ex- 
pert opinion taking into account the estimated difficulty, risk and effort associated with 
rewriting the component from scratch, the estimated impact of wrapping the component 
on the system performance, and the coupling of the component to data structures and 
routines that are to be rewritten. 

Most decisions involve the consideration of trade-offs, for example balancing the 
performance loss due to the wrapping against the risk and effort of rewriting a compli- 
cated mathematical function. To support these trade-off decisions, we developed a sim- 
ulation benchmark to obtain performance estimations. Additionally, we provided size 
and complexity measures. 

In case of doubt, the default strategy was to prefer wrapping to rewriting in order to 
obtain a new kernel as early as possible and then gradually migrate these components. 
In the end, the decision relies mainly on the judgment of an expert in accordance with 
the design decisions already made. The decisions for all components are recorded in the 
wrap/rewrite proposal. Out of the 250 potentially reusable components, 112 were 
selected for reuse. 

Product Line Architecture 

We use an iterative process (an adaptation of the PuLSE™-DSSA component for 
reference architecture creation) to design the product line architecture. The features taken 
from the product map and the task scenarios are inputs to this process. Additionally, the 
current kernel architecture, the wrap/rewrite proposal, and the candidates for reuse are 
consulted. 

The current kernel architecture serves as a source of information about the compo- 
nents in the current system, their interfaces, and the way they interact. Another impor- 
tant aspect is the current structure of the human model because it has to be possible to 
migrate models from the current RAMSIS to the RAMSIS product line. The candidates 
for reuse have to be components in the new architecture and their interfaces must be de- 
signed according to their description in the wrap/rewrite proposal. 

Architecture development started with a basic set of features taken from the product 
map. They were used to create an initial set of components and their interconnections. 
Appropriate task scenarios are used to evaluate the product line architecture according 
to SAAM [11]. In the next iteration of the product line architecture development, a new 
set of features is chosen. This set is then applied to the current architecture to refine it. 
This refinement can be done either by adding components or by refining existing com- 
ponents. The process can be stopped when all necessary features have been taken into 
account. 

For the RAMSIS kernel redesign, we started with the design of the common core. 
The structural composition of human models is called ‘inner model’ in RAMSIS. Tak- 
ing into consideration the current definition of the inner model, we designed a new hi- 
erarchical model structure based on the Composite Pattern [8]. The new design allows 
the uniform handling of simple and compound body parts, as well as altering the model 
structure. Using the classes that represent the pattern (abstract, simple, and compound 
body part, see Figure 7), we created a basic model structure by defining compound body 
parts and their contained simple body parts. 
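
The three roles named above (abstract, simple, and compound body part) can be sketched as a small class hierarchy. This is a minimal Python illustration of the Composite Pattern, not the class design of Figure 7; the method names and the example body parts are hypothetical.

```python
from abc import ABC, abstractmethod


class BodyPart(ABC):
    """Abstract body part: the uniform interface for all model elements."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def parts(self):
        """Yield this part and all parts contained in it."""


class SimpleBodyPart(BodyPart):
    """Leaf of the model structure."""

    def parts(self):
        yield self


class CompoundBodyPart(BodyPart):
    """Body part that contains other (simple or compound) body parts."""

    def __init__(self, name, children=()):
        super().__init__(name)
        self.children = list(children)

    def add(self, part):
        self.children.append(part)
        return part

    def remove(self, part):
        self.children.remove(part)

    def parts(self):
        yield self
        for child in self.children:
            yield from child.parts()


# Hypothetical model fragment.
arm = CompoundBodyPart("arm", [SimpleBodyPart("upper arm"), SimpleBodyPart("forearm")])
hand = arm.add(CompoundBodyPart("hand", [SimpleBodyPart("thumb")]))
NAMES = [part.name for part in arm.parts()]
```

Client code iterates over `parts()` without distinguishing simple from compound parts, and `add` and `remove` change the model structure at run-time, which corresponds to the two properties claimed for the new design.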

The inner model is independent of the information needed to render it. This prop- 
erty is true for all model elements and allows the use of different rendering methods for 
the same model element. 

Figure 7 shows a class diagram (simplified at the request of our customer) of the 
product line architecture design. 

In the next iteration, we handled the feature of ‘attaching external objects’. The 
different user groups need to attach different objects to the inner model to model different 
realizations of skins, bones, muscles, and body-external objects (e.g., shoes, clothes, 
and glasses). Based on the Observer Pattern [8], we modeled the attachment of objects. 
The purpose of the Observer Pattern is the definition of a sender-listener dependency 
among objects at run-time. Each attached object is registered as a listener to a set of 
changes affecting the object it is attached to. This mechanism also allows multiple 
attachments (e.g., inner model ← bones ← muscles ← skin). Listeners are notified 
whenever attributes they are registered for change. 
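
The attachment mechanism can be sketched as follows. This is a minimal Python illustration of the Observer Pattern under stated assumptions, not the kernel design: the class and method names are hypothetical, and the choice to let an attached object forward a change to its own listeners is an assumption made here to show a chain of attachments.

```python
class Observable:
    """Object whose attribute changes can be listened to."""

    def __init__(self, name):
        self.name = name
        self._attributes = {}
        self._listeners = {}  # attribute name -> list of registered listeners

    def attach(self, listener, attributes):
        """Register `listener` for changes to the given attributes."""
        for attribute in attributes:
            self._listeners.setdefault(attribute, []).append(listener)

    def set(self, attribute, value):
        self._attributes[attribute] = value
        for listener in self._listeners.get(attribute, []):
            listener.notify(self, attribute, value)


class AttachedObject(Observable):
    """Listener that can itself have further objects attached to it."""

    def __init__(self, name):
        super().__init__(name)
        self.log = []

    def notify(self, source, attribute, value):
        self.log.append((source.name, attribute, value))
        # Assumption: forward the change so that objects attached to this
        # one are notified as well.
        self.set(attribute, value)


inner = Observable("inner model")
bones, muscles, skin = (AttachedObject(n) for n in ("bones", "muscles", "skin"))
inner.attach(bones, ["posture"])
bones.attach(muscles, ["posture"])
muscles.attach(skin, ["posture"])

inner.set("posture", "seated")  # travels along the whole chain
inner.set("size", "tall")       # nobody is registered for this attribute
```

Each object in the chain is notified only by the object it is attached to, and only for the attributes it registered for.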

The third iteration was concerned with a general skin model and the design of a first 
implementation of that general model, a point-based skin as currently used in RAMSIS. 
Another representation of the skin may be modeled using B-splines. The generic 
interface of the general skin model provides means for easily exchanging skin models and 
their representations. 

Figure 7. Simplified Class Diagram of the Kernel Design 

The next iteration of the design integrated the high-level analyses and 
manipulations. A uniform interface for those features allows the usage of newly written C++ 
code or wrapped objects. 

Integration strategy 

The integration strategy describes how to apply the wrapping scheme to the candidates 
selected for reuse. It extends the wrapping scheme with an explicit description of the 
transformations to be applied to the source files in order to wrap them. Both the results 
of the previous reengineering steps (wrapping scheme and wrap/rewrite proposal) and 
the new product line architecture are needed to design this work product. 

For RAMSIS, the integration strategy contains a description of a partly automated 
process to analyze and wrap the selected candidates, and a fully automated process to 
redo the wrapping after maintenance of the Fortran code. 

Transition plan 

The transition plan for the RAMSIS kernel redesign project consists of the product line 
architecture, the integration strategy, the wrapping scheme, effort estimates, and a 
schedule for the transition to the new kernel. It also includes recommendations 
concerning the maintenance and evolution of the wrapped Fortran in the new kernel and its 
migration toward C++. 

Figure 8. Feature View of RAMSIS 

3.5 Major Impacts of RE-PLACE 

The impact of RE-PLACE, both on the reuse in the product line of kernels and on the 
time saved in producing a new kernel, can already be perceived. 

Figure 8 presents the grouped and layered features together with information on 
their expected reusability. Out of the 45 high-level kernel features, 11 are new ones that 
have to be written. The 22 features that have to be rewritten are mainly concerned with 
the human model. They represent 18% of the original kernel (in lines of code). The re- 
maining 12 features, which are implemented by 49% of the original kernel, will be 
wrapped and reused. Most of them correspond to complex mathematical analyses. 

Three years ago, our customer estimated that the creation of a new object-oriented 
kernel for RAMSIS would require 10 man-years. As a result of the first project stage 
with an effort of 4 man-years, it is estimated that the second stage, i.e. the transition to 
the new object-oriented kernel, will require less than 2 man-years of effort. This means 
that the RE-PLACE-based approach requires less than 6 man-years in comparison to 10 
man-years for a complete rewrite. It should also be considered that the 4 man-years in- 
clude the definition of RE-PLACE, the exploration of new techniques, and the devel- 
opment of the infrastructure. Furthermore, future projects based on RE-PLACE could 
be performed more efficiently, because they can integrate some of the lessons learned. 

3.6 Some Lessons Learned 

The lessons learned during the project include: 

• It is important to have an early consensus on the structure of the blackboard, the 
types of documents which are put there, and the guidelines concerning who 
should be informed about changes to specific parts of the blackboard. 

• With multiple groups working in parallel and focusing on different aspects, 
controlling the vocabulary in use becomes an important issue. Without a glossary the 
vocabulary can easily diverge. 

• At the beginning of a project, it is difficult to fully instantiate the framework. The 
need for new work products will emerge as more is understood about the product 
line and the existing system. 

• The reengineering techniques used in a project depend heavily on the exact type 
of repackaging and integration needed, as well as on the programming languages 
used by the existing and future product line applications. 

• It is important to adapt the methods and notations to the customer’s culture in or- 
der to facilitate collaboration with experts and acceptance of the new product line 
architecture. 

4 Conclusion 

In this paper, we presented RE-PLACE, a framework for the integration of reengineer- 
ing and product line activities to support the transition of existing assets into a product 
line architecture. The applicability of the framework was shown by presenting how it 
was used in the RAMSIS kernel redesign project. The experience gained so far confirms 
the initial ideas that led to the development of RE-PLACE: 

• Synergy: The integration of reengineering and product line activities was well 
supported by the blackboard mechanism. The existence of joint work products on 
the blackboard led to significant synergy effects. 

• Global view: Early information flow from one activity to the other enhanced our 
understanding of the project and allowed each group to take a more global per- 
spective. 

• Sound separation of activities: The decomposition of the RE-PLACE process into 
two concurrent sets of activities, loosely coupled through the blackboard, allowed 
each of the two groups to focus on its activities while still having access to the 
information required. This allowed a high degree of parallelism. 

The initial experience and results of applying RE-PLACE encourage us to do further 
research. This involves applying RE-PLACE in other contexts as well as investigating 
possibilities for further tool support. 

Abstract. We introduce CHIME, the Columbia Hypermedia IMmersion 
Environment, a metadata-based information environment, and describe its 
potential applications for internet and intranet-based distributed software 
development. CHIME derives many of its concepts from Multi-User Domains 
(MUDs), placing users in a semi-automatically generated 3D virtual world 
representing the software system. Users interact with project artifacts by 
"walking around" the virtual world, where they potentially encounter and 
collaborate with other users' avatars. CHIME aims to support large software 
development projects, in which team members are often geographically and 
temporally dispersed, through novel use of virtual environment technology. 

We describe the mechanisms through which CHIME worlds are populated 
with project artifacts, as well as our initial experiments with CHIME and our 
future goals for the system. 

1 Introduction 

Software development projects typically involve much more than just source code. 
Even small- to medium-sized development efforts may involve hundreds of artifacts — 
design documents, change requests, test cases and results, code review documents, and 
related documentation. Significant resources are poured into the creation of these 
artifacts, which make up an important part of an organization's "corporate memory." Yet 
it can be quite difficult for new project members to come up to speed — and thus become 
productive contributors to the development effort. 

The user interface research community has, in recent years, paid much attention to the 
development of techniques to help users assimilate a broad range of related information 
easily, through the development of 3D-based information visualizations (see [1] for ex- 
ample). These techniques excel at putting large volumes of related information in con- 
text for an unfamiliar user, as well as making it easier for more experienced users to find 
particular information they are looking for. See [2] for a description of many 
experiments and results in this area. 

Hypertext is another technique which has been widely recognized as being useful for 
contextually relating documents and other artifacts [3]. By placing links among related 
items, experts can leave trails through the information space. Later users can follow a 
"trail" of hypertext links which lead them to many related documents. In doing so, a 
user may, for instance, gain important insight into how various components of a soft- 
ware system are interrelated. Links may be automatically generated by tools as well 
(see [4]). 

CHIME, the Columbia Hypermedia IMmersion Environment, is a framework that aims 
to synthesize results from both these research communities to create a software devel- 
opment environment for managing and organizing information from all phases of the 
software lifecycle. CHIME is designed around an XML-based metadata architecture, 
in which the software artifacts continue to reside in their original locations. Source 
code may be located in a configuration management system, design documents in a doc- 
ument management system, email archives on a corporate intranet site, etc. Thus 
CHIME does not handle storage of artifacts — users continue to use their existing tools 
to access this data while CHIME provides data organization and hypertext linking ca- 
pabilities. CHIME maintains only location and access information for artifacts in the 
form of metadata. CHIME uses an extensible Virtual Environment Model and dynamic 
theme component (described below) to generate a Multi-User Domain style virtual 
world from the metadata. The virtual world may take the form of a 3D immersive vir- 
tual reality (as in many contemporary games like "Quake" from Id Software) or a simple 
text world (as in the original "Adventure" and "Zork" games from the 1970's and 
1980's). 
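
The idea of keeping only location and access information as metadata, and deriving a world from it, can be sketched as follows. This is a minimal Python illustration and not CHIME's actual schema or generator: the XML element and attribute names, the artifact paths, and the rule of one room per artifact kind are all hypothetical, and the Virtual Environment Model and theme component are not modeled.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata: artifacts stay where they are; only their
# location and access protocol are recorded.
METADATA = """
<project name="example">
  <artifact name="scheduler" kind="source">
    <location protocol="cvs" path="kernel/sched.c"/>
  </artifact>
  <artifact name="scheduler design" kind="design-document">
    <location protocol="http" path="docs/sched-design.html"/>
  </artifact>
  <artifact name="scheduler patch thread" kind="email">
    <location protocol="http" path="archive/thread-1.html"/>
  </artifact>
</project>
"""


def build_world(xml_text):
    """Group artifacts into rooms, one room per artifact kind."""
    root = ET.fromstring(xml_text)
    rooms = {}
    for artifact in root.iter("artifact"):
        location = artifact.find("location")
        entry = (artifact.get("name"), location.get("protocol"), location.get("path"))
        rooms.setdefault(artifact.get("kind"), []).append(entry)
    return rooms


def describe(rooms, kind):
    """Render one room in the style of a simple text world."""
    names = ", ".join(name for name, _, _ in rooms[kind])
    return f"You are in the {kind} room. You see: {names}."


WORLD = build_world(METADATA)
```

Because the world is derived from the metadata alone, the same records could in principle feed a different rendering (for instance a 3D one) without touching the artifacts themselves.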

In the CHIME virtual world, users interact with project artifacts by "walking around," 
where they potentially encounter other users' representations (avatars). While inciden- 
tal encounters add a sense of realism to a virtual world, a potentially more useful appli- 
cation of this technology (which CHIME supports) is to allow a novice user to collab- 
orate with an expert by finding his avatar and beginning a conversation. Geographically 
and temporally dispersed users thus easily gain context with work being performed 
elsewhere. 

This paper proceeds as follows: in the next section, we discuss the model which underlies
CHIME, followed by a discussion of our current architecture. Next, we describe
a recent experiment in which we used CHIME to build a 3D environment from the
Linux operating system kernel, documentation, and email archives. Following this, we
describe related work in a variety of domains, including Software Development Envi- 
ronments (SDEs), Software Visualization, Information Visualization, and Metadata ar- 
chitectures. Finally, we discuss the contributions of this work and some future direc- 
tions. 

2 Model 

The conceptual model underlying CHIME is comprised of three main components: 
Groupspaces, Groupviews, and Software Immersion. In this section, we describe these 
components and their relationships. 
We use the term Groupspace to describe a persistent collaborative virtual space in 
which participants work. The participants may be geographically or temporally distrib- 
uted, and they may be from different organizations cooperating on a common project 
(subcontractors on a defense contract, for example). Contained within the groupspace 
are project artifacts as well as the tools used to create, modify, and maintain them. Ar- 
tifacts may be organized and re-organized at will by project participants. 

Central to the Groupspace concept is the idea that project artifacts continue to exist in 
their original form in their original repositories. This differs from traditional Software 
Development Environments (SDEs) (like Oz[5], SMILE[6], Desert[7], Sun NSE[8],
Microsoft Visual C++[9]), as well as most traditional Groupware systems (like
eRoom[10], TeamWave[11], and Orbit[12]) in which artifacts are under the strict
control of the environment. In these systems, users are expected to access artifacts only
through the development environment's cadre of tools or via COTS tools specially 
"wrapped" to work with the environment. In a Groupspace, artifacts continue to exist 
in their legacy databases, configuration management systems, bug tracking systems, ra- 
tionale capture tools, etc. 

Additionally, Groupspaces may contain information generated within the space. A par- 
ticular Groupspace may contain built-in tools and services to be used by participants, 
e.g. to add arbitrary annotations to particular artifacts, hold real-time chat sessions, add 
hypertext links on top of (and separate from) artifacts in the system, semi-automatically 
propagate knowledge among participants (in the manner of a recommender system [13]), etc.

We use the term Groupviews to describe multiuser, scalable user interfaces used to nav- 
igate and work in a Groupspace. In addition to allowing Groupspace participants to find 
and access relevant information quickly (as they might in a single user system, or a sys- 
tem in which they had no knowledge of other users' actions), Groupviews keep users 
informed about work being performed by fellow users. 

Groupviews build on research and commercial work in Multi-User Domains (MUDs) [14],
chat systems [15], virtual environments [16], and 3D immersive games [17]. In a
Groupview, a set of virtual environment rooms containing project artifacts is generated 
from the organization of the artifacts in the Groupspace. Rather than placing artifacts 
into these rooms arbitrarily or according to some external mapping mechanism (as in 
Promo[26], where the mapping from artifacts to rooms is created from a software proc- 
ess definition and cannot be modified by users without corresponding modification to 
the process), a Groupview generates the rooms and connections between the rooms 
from the artifacts themselves. For example, a software module might become a room
in the Groupview, and the source files making up the module might be furnishings inside
the room. Corridors might link the module's room with rooms containing design
documentation, test reports, and other artifacts related to the code. 
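
The mapping in this example can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names Room and build_rooms and the input shapes (a dictionary of modules and a list of related pairs) are hypothetical and are not part of CHIME.

```python
from dataclasses import dataclass, field

@dataclass
class Room:
    name: str
    furnishings: list = field(default_factory=list)  # artifacts shown inside the room
    corridors: set = field(default_factory=set)      # names of directly connected rooms

def build_rooms(modules, related):
    """modules: module name -> list of source files.
    related: pairs of room names that should be joined by a corridor."""
    rooms = {name: Room(name, list(files)) for name, files in modules.items()}
    for a, b in related:
        # Rooms for documentation, test reports, etc. are created on demand.
        rooms.setdefault(a, Room(a))
        rooms.setdefault(b, Room(b))
        rooms[a].corridors.add(b)
        rooms[b].corridors.add(a)
    return rooms

rooms = build_rooms({"net": ["tcp.c", "udp.c"]}, [("net", "net-docs")])
```

Because the rooms are derived from the artifacts, rerunning the function after the artifacts change yields the updated layout; no separate mapping has to be maintained by hand.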

A core aspect of Groupviews is the ability to provide selective awareness of other users' 
actions. Participants' locations in the virtual environment, as well as their scope of in- 
terest (i.e. the project or projects they are currently involved in, portions of the system 
they are considered "expert" in, related documents they have recently read, written, or 
modified, etc.) are shared among other users. In the case of a Groupview involving 
multiple teams working on separate (but interrelated) portions of a project, it should be 
possible for users to "tune" awareness so they receive only information relevant to them 
and their work. 

It is important to note that the Groupview model does not require that Groupviews
be built as graphical or 3D "virtual reality" style user interfaces. As the very
successful IRC system [15] and the literally thousands of text-based MUDs available 
on the internet have shown, 3D graphics are not necessarily required to provide an im- 
mersive experience to users. As shown in [32], users utilizing immersive environments 
for real work can get quite involved with a simple text-based user interface, even to the 
extent of ignoring their family life in favor of their virtual one. 

In a Software Immersion, the third component of the conceptual model underpinning 
CHIME, team members collaborate and perform individual tasks in a virtual space de- 
fined by the structure of the artifacts making up the software system. This builds in 
some respects on previous work done in the Software Visualization community (see 
[18]) in which visualizations of module interactions and interrelations are created. The 
primary difference here is that a Software Immersion is intended to be built semi-auto- 
matically, while most software visualizations are generated by human experts. When 
visualizations have been created by software, the generating software has been built to 
handle a certain small class of input (output from sorting algorithms for example as in 
[20]). 

When applied properly, Software Immersion can shorten the learning curve faced by new
project members. The architecture and organization of the system they are learning is
no longer an abstract concept; it is something they can walk around in and inhabit. Software
Immersion is similar in concept to emerging technology in use in Civil Engineering.
In [21], the author shows quantitatively that new construction project members
come up to speed faster and make fewer mistakes when an immersive, virtual
construction environment is built from the building design.

CHIME benefits from the synergy among these three conceptual components. Awareness
mechanisms in Groupviews work to enhance Software Immersion, since participants
are now immersed not only in the software artifacts but also in the actions of others
around them. Making this information available helps to fulfill the Groupspace
goal of supporting geographically and temporally dispersed project participants.
Groupspaces benefit from Groupviews' visualization aspects, as users are able to locate 
information in the underlying Groupspace by quickly navigating the Groupview. This 
allows them to learn "where" information exists. In addition, users are drawn to new 
information via notifications from the awareness mechanisms in a Groupview.

Realization of this conceptual model is challenging. Groupviews are dynamic environments.
As artifacts are added, modified, deleted, and moved in the underlying Groupspace,
Groupview participants must see the virtual environment evolve as well. This
may be as simple as periodically bringing new or newly modified artifacts to their at- 
tention, or as complex as "morphing" the world as they see it to a new layout, based on 
a major change to the underlying Groupspace layout. See [22] for a discussion of the 
transaction management and version control issues inherent in handling these dynamic 
notifications. Groupspaces face the problem of working with data in remote repositories
which may or may not include any transaction or lock support, and can thus change
at any time. The architecture designed to implement this model, described below, at- 
tempts to handle these challenges while maximizing the benefits of the CHIME concep- 
tual model. 

3 Architecture 

CHIME's architecture was designed around three main components, as illustrated in
Figure 1. In this architecture, separate services are responsible for organizing artifacts,
parameterizing those artifacts as virtual environment types, and dictating how the virtual
environment appears to clients. We will describe each of the components in turn.

Figure 1 CHIME architecture. (Surviving figure label: Data Repositories.)

The Xanth Data Service (evolved from our lab's previous work on OzWeb[24]) addresses
the data organization and hypermedia aspects of the Groupspace model. In
Xanth, data is organized into a multi-rooted tree hierarchy of XML elements known as 
Xanth dataElements. Each dataElement refers to a single piece of information living in 
an external data repository of some kind (web server, configuration management system,
document management system, relational database, etc.). The Xanth Data Service
maintains an XML document (made up of these dataElements) which completely describes
the contents of the Groupspace. Figure 2 shows an example dataElement with
the minimum set of fields filled in.

<dataElement
    name="README"
    id="1000"
    protocol="http"
    server="library.psl.cs.columbia.edu"
    port="80"
    path="/linux-2.0.36/README"
    hidden="false"
    parent="0"
    behavior="GET"/>

Figure 2 XML description of dataElement

From the figure, we see that every dataElement includes a name, a unique id number,
as well as a "parent" field specifying its parent dataElement. The protocol-related fields
(protocol, server, port, and path) describe what mechanism is to be used to retrieve the 
data associated with a particular dataElement. In the example above, the dataElement 
is named "README" and is an http-accessible file. The server, port, and path fields 
simply give location information for the dataElement. 
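
As a minimal sketch of how the fields of Figure 2 might be consumed, the following Python fragment parses a dataElement and assembles a retrieval URL from its protocol-related fields. The function name locate is hypothetical, and we assume the element is closed with "/>" so that it parses as XML; neither is taken from the Xanth implementation.

```python
import xml.etree.ElementTree as ET

# The dataElement of Figure 2, written as a self-closing element (an assumption).
FIGURE_2 = '''<dataElement name="README" id="1000" protocol="http"
    server="library.psl.cs.columbia.edu" port="80"
    path="/linux-2.0.36/README" hidden="false" parent="0"
    behavior="GET"/>'''

def locate(element_xml):
    """Return (name, url), with the URL built from protocol, server, port, and path."""
    elem = ET.fromstring(element_xml)
    url = "{protocol}://{server}:{port}{path}".format(**elem.attrib)
    return elem.get("name"), url

name, url = locate(FIGURE_2)
```

Only location information is read here; the artifact itself stays on its original server, in keeping with the Groupspace model.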

Xanth uses protocol plugins to implement the retrieval protocols specified in the da- 
taElements. Continuing our example, an HTTP plugin would be configured into this 
instance of Xanth. A basic protocol plugin is quite simple; all it can do is verify that the 
server, port, and path fields of a given dataElement are of the proper format for this pro- 
tocol. To become more useful, each protocol plugin may provide a set of "behaviors" 
for its dataElements. Without these behaviors, Xanth cannot perform any actions on the 
data. The HTTP plugin, for example, may provide behaviors for the basic HTTP meth- 
ods, namely GET, POST, PUT, etc. If the protocol plugin provides behaviors, it is ex- 
pected to add a "behavior" field to each of its dataElements' XML. This field contains 
a list of behaviors supported for this dataElement, and may be used by other compo- 
nents of the system to determine the actions the user can take with a given dataElement. 
In Figure 2, we can see that the HTTP plugin has added the GET behavior to this dataElement,
indicating it is able to fetch the document on behalf of the user, if needed.
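
The division of labor between a basic plugin and its behaviors might look as follows. This is a sketch under our own assumptions: the class and method names are hypothetical, dataElements are modeled as plain dictionaries, and the "behavior" field is assumed to hold a comma-separated list.

```python
import re

class ProtocolPlugin:
    """Basic plugin: it can only check that the location fields are well formed."""
    behaviors = ()  # a basic plugin offers no actions on the data

    def validate(self, elem):
        return (bool(re.fullmatch(r"[A-Za-z0-9.-]+", elem.get("server", "")))
                and elem.get("port", "").isdigit()
                and elem.get("path", "").startswith("/"))

    def annotate(self, elem):
        """Add a "behavior" field listing the supported behaviors, if there are any."""
        if self.behaviors:
            elem["behavior"] = ",".join(self.behaviors)
        return elem

class HttpPlugin(ProtocolPlugin):
    behaviors = ("GET",)  # a fuller plugin could also list POST, PUT, etc.

elem = {"name": "README", "protocol": "http",
        "server": "library.psl.cs.columbia.edu",
        "port": "80", "path": "/linux-2.0.36/README"}
plugin = HttpPlugin()
if plugin.validate(elem):
    plugin.annotate(elem)
```

Other components can then inspect the "behavior" field to decide which actions to offer the user for a given dataElement.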

To address the hypertext features of the Groupspace model, Xanth includes a Link Service
which provides typed, n-ary, bidirectional hypertext links among elements. The
Xanth Link Service maintains its own XML document made up of linkElements (see 
Figure 3). Each linkElement has 3 fields: a unique id number, a descriptive type field, 
and a list of dataElement id's which are part of this link. The hypertext model followed 
by the Xanth Link Service is different from the hypertext model underlying the WWW 
— in Xanth, hyperlinks are stored separately from the data they reference, while WWW 
pages embed link references inside their content. Xanth's model is richer,
supporting more sophisticated hypertext among artifacts. It is conceptually similar to
the hypertext provided by the various Open Hypermedia Systems [23]. 

<linkElement
    id="924"
    type="Related Docs"
    dsElems="2048,1000"/>

Figure 3 XML description of linkElement 
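
Because links are stored apart from the data, a link can be followed from any of its members. The sketch below builds such a lookup from a link document; the enclosing <links> element and the function name index_links are our assumptions for illustration, not details of the Xanth Link Service.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# A link document holding the linkElement of Figure 3 (the wrapper is assumed).
LINKS = '''<links>
  <linkElement id="924" type="Related Docs" dsElems="2048,1000"/>
</links>'''

def index_links(links_xml):
    """Map each dataElement id to a list of (link type, other member ids)."""
    index = defaultdict(list)
    for link in ET.fromstring(links_xml).iter("linkElement"):
        members = [int(m) for m in link.get("dsElems").split(",")]
        for member in members:
            others = [m for m in members if m != member]
            index[member].append((link.get("type"), others))
    return index

index = index_links(LINKS)
```

The same loop handles links with more than two members, which is what makes the links n-ary as well as bidirectional.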

The second component of the CHIME architecture is the Virtual Environment Modeller 
(VEM) service. This service is responsible for parameterizing each dataElement with 
one of an extensible set of virtual environment types. These types are meant to be rep- 
resentative categories for the various parts of a virtual environment. To date, we have 
defined only three types: Component, Container, and Connector. 

In the VEM schema, Components are a base type; they are the default type given to eve- 
ry dataElement. Child VEM types "derive" from Component in the standard manner of 
Object-Oriented databases: they inherit fields from the parent type, and can be treated 
as an instance of that parent type, and also include their own fields as a form of special- 
ization. The type 'Container' (which derives from Component) is given to dataElements 
which, for the purposes of the virtual environment, have a set of elements inside of 
them. A Connector (which itself derives from Container) not only may contain ele- 
ments but explicitly connects two or more Containers. The VEM parameterizes each 
dataElement from the Data Service by adding an XML field called "VEMtype" to each 
dataElement. This information is used by the Theme Service (see below) to determine 
the role of each artifact in the resulting virtual environment. 
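
The derivation among the three types can be sketched with ordinary class inheritance. The constructor arguments and the helper vem_type are hypothetical; only the type names and their relationships come from the description above.

```python
class Component:
    """Base VEM type, given to every dataElement by default."""
    def __init__(self, element_id):
        self.element_id = element_id

class Container(Component):
    """A Component that has a set of elements inside it."""
    def __init__(self, element_id):
        super().__init__(element_id)
        self.contents = []

class Connector(Container):
    """A Container that also explicitly connects two or more Containers."""
    def __init__(self, element_id, endpoints):
        super().__init__(element_id)
        if len(endpoints) < 2:
            raise ValueError("a Connector joins at least two Containers")
        self.endpoints = list(endpoints)

def vem_type(obj):
    """Value for the "VEMtype" field added to the dataElement's XML."""
    return type(obj).__name__

module = Container(2000)
docs = Container(2048)
hall = Connector(3000, [module, docs])
```

A Connector can be treated as a Container or a Component wherever one is expected, which mirrors the inheritance described for the VEM schema.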

Note that despite dealing with virtual environment concepts, the VEM does not hard- 
wire a particular notion of how the virtual environment will present itself to a user. We 
have deliberately designed the VEM and Xanth DataService to remain neutral with re- 
gard to the final user interface and display mechanisms used to produce the virtual en- 
vironment from the Groupspace artifacts. To borrow a term from more industrial dis- 
ciplines, the CHIME architecture can be seen as an "assembly line" moving remote data 
into the Xanth Data Service (where their location information is stored), next into the 
VEM (which decides their eventual roles in the virtual environment) and finally to the 
CHIME Theme Service. 

The CHIME Theme Service is responsible for all aspects of the virtual environment cre- 
ated for and inhabited by the system users. The Theme Service is broken into two com- 
ponents, Theme Plugins which run in a CHIME client and a MUD service which runs 
in the CHIME server. The MUD service is quite simple. It is responsible only for keeping
track of the locations of the system users (i.e. which VEM Container element they are
currently located in) as well as relaying chat messages to all users of a particular room.
CHIME clients connect to the Theme Service and download available Theme Plugins. 
These Theme Plugins are then started in the client and are responsible for retrieving the 
XML document maintained by the Xanth Data Service. From this XML, the Theme 
Plugin lays out a virtual world according to the capabilities of the client. If the client 
has 3D capabilities, the theme plugin may build a 3D representation of the world and 
allow the user to walk through it. If the client is connected to the server via a very low 
bandwidth connection, or is running on a slow system, the Theme Plugin may lay out a 
textual world. In the CHIME architecture, all user interface decisions are left to the 
Theme Plugins at run time. 
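
A run-time decision of this kind might be sketched as follows. The function names, the input flags, and the 128 kbps threshold are arbitrary assumptions made for illustration; the actual Theme Plugins are Java components, and their interface is not shown in this paper.

```python
def choose_layout(has_3d, bandwidth_kbps, slow_system, min_kbps=128):
    """Pick a presentation for this client (the threshold is an assumed value)."""
    if bandwidth_kbps < min_kbps or slow_system or not has_3d:
        return "text"
    return "3d"

def describe_room(name, furnishings, layout):
    """Render one Container as a line of text, or as a scene description
    that a 3D renderer could consume."""
    if layout == "text":
        return "You are in {}. You see: {}.".format(name, ", ".join(furnishings))
    return {"room": name, "objects": list(furnishings)}

layout = choose_layout(has_3d=True, bandwidth_kbps=56, slow_system=False)
line = describe_room("kernel", ["sched.c", "fork.c"], layout)
```

Both branches consume the same room data, so the choice of presentation does not affect the Data Service or the VEM.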

4 Implementation 

Our initial implementation work on CHIME is complete. The architecture described 
above has been implemented and we have performed an initial experiment designed to 
test the system's scalability with data from a large software system, the Linux 2.0.36 
kernel. We have loaded our experimental implementation with source code, build in- 
structions (including Makefiles and more human-oriented instructions), documentation 
artifacts (both the informal documentation provided with the kernel source as well as 
well-known web pages providing tutorials and information about the Linux kernel), and 
web-based archives of the linux-kernel mailing list (used by the core developers and 
maintainers of the various kernel subsystems for technical discussions). All in all, this 
project includes over 1.2 million lines of source code and several hundred megabytes of 
documentation. Where possible, we have attempted to use CHIME's hypermedia capa- 
bilities to cross link (by hand) the external documentation artifacts with the source mod- 
ules they deal with. The aim of this experiment was to simulate a large, complex, on- 
going software effort using CHIME for its day-to-day work. 

Our initial CHIME client and Theme Plugin build a simple 3D virtual environment 
from this data. Source code modules are rooms in the environment; individual files are 
rendered as furnishings in those rooms. Project members explore artifacts by walking 
around the virtual environment, and can interact with the artifacts and each other. As 
of this writing, we have integrated only a few tools into our prototype, notably video- 
conferencing software to allow participants to communicate with each other, web 
browsing software for accessing the numerous web-accessible artifacts related to the 
Linux kernel, and a text editor allowing participants to edit source code. We intend to 
reuse our Rivendell web-based tool server[24], originally developed for the OzWeb 
system, to provide more sophisticated tool launch and management facilities in 
CHIME. 

5 Evaluation 

The implementation presented here is actually CHIME 2.0. We previously developed 
another version as a prototype in which the clients used Virtual Reality Modelling Lan- 
guage (VRML) browsers to interact with the environment. It provided a similar immer- 
sive experience to users, but the conceptual model underneath was not as fully devel- 
oped. Clients connected directly to the Groupspace; Groupview techniques were ig- 
nored. 

The most significant limitation of CHIME 2.0 is that the existing implementation ig- 
nores versioning and transaction management issues in the environment. We are ad- 
dressing this problem, as discussed in [22] and hope to incorporate an implementation 
quickly. 

Another, less significant limitation is the absence of an easy mechanism for populating
a CHIME instance with artifacts. The Xanth Data Service operates on XML, and in the
current implementation users are expected to provide XML describing new dataElements
to be added. We intend to address this limitation by creating a simple GUI-based
mechanism for new artifacts to be added inside a CHIME environment. 

CHIME 2.0 is implemented entirely using Java 2 (aka JDK 1.2), and the initial Theme 
Plugin discussed here uses the SGI Open Inventor 2.1 API to provide 3D graphics capabilities.
Our use of Java and Open Inventor allows CHIME to be portable; although our
primary development platform is Windows NT, we have successfully used CHIME on 
both Sun and SGI unix workstations. We have attempted, where possible, to utilize ex- 
isting technologies in constructing CHIME, including standard Java remote method in- 
vocation (RMI) for communications between CHIME server components, and Sun's 
XML processor [25] for all of CHIME's XML requirements. 

6 Related Work 

LambdaMOO [14] is prototypical of many Multi-User Domain systems, and many 
newer systems are still built around the original LambdaMOO implementation. Lamb- 
daMOO, through the use of an object oriented database and associated programming 
language, explored many of the ideas in Groupviews. We chose not to build on Lamb- 
daMOO, however, because the OODB underlying the system must contain all virtual 
environment components. This does not fit our Groupspace model for storage of the 
artifacts. 

Promo[26] builds a virtual environment interface from a software process definition. 
Rooms in the environment are mapped to subcomponents of the process. In the envi- 
ronment, artifacts are located in the rooms in which the process will utilize them (i.e. a 
room for compiling code will contain the code, etc.). This is the first work the authors 
are aware of which attempts to marry virtual environment techniques with software de- 
velopment environments. 

A number of Software Development Environments (SDEs) have been created over the 
years in both research and industry (see [6], [7], [8], [9], [5] for example). These differ 
from our conceptual model in that they assume that all artifacts will be managed by the 
development environment itself (or through tools specially "wrapped" to be called from 
the environment). In addition, existing SDEs do not provide the Software Immersion 
inherent in CHIME. 

Many research and commercial Groupware systems might at first glance appear to fulfill
our Groupspace model. Systems like Orbit[12], TeamRooms[11], eRoom[10], and
Lotus Notes[27] do have much in common with CHIME Groupspaces, but these sys- 
tems store artifacts inside their servers. When they do allow reference to external data, 
it is often limited to a web link to external data. A number of such systems have explored
awareness mechanisms similar to those of our Groupviews.

Microsoft NetMeeting[28] and other real-time collaboration tools provide a form of 
temporary Groupspace. While in a meeting, users can share applications (effectively 
making single-user COTS tools multiuser), use these tools to bring in data from external 
sources, and have some awareness of others' actions. These workspaces, however, are
not persistent; when the last participant leaves a meeting, the workspace disappears. In
addition, these tools do not scale well beyond a few users and a relatively small number 
of artifacts - bandwidth and memory issues in real time collaboration make this diffi- 
cult. 

Research into Software Visualization (and the related area of Algorithm Animation) 
looks at the design and development of techniques to show program code, algorithms, 
and data structures by using typography, graphics, and animation. The Software Im- 
mersion in our conceptual model for CHIME can be seen as a form of Software Visu- 
alization, as we are displaying the organization of software artifacts through the design 
of a virtual environment. [19] contains a good overview of research in this area. 

A number of research and commercial systems (see, for instance, [29] and [30]) utilize 
metadata style architectures to provide access to back-end, remote data. These systems 
are typically focused on query optimization and similar database research issues over 
remote data sources. This work is quite relevant to the Groupspace concept, as a pow- 
erful query facility optimized for use in a metadata-based system would be a useful ad- 
dition to the Groupspace model. 

An ongoing conference series discusses the use of Multi-User Domains for "real" work 
[31]. Research results from this community demonstrate many examples of the use of 
MUDs and virtual environment systems in engineering disciplines as well as other work 
areas. 

7 Conclusions and Future Work 

We have designed a conceptual model and a feasible architecture, and completed an initial
implementation, of a metadata-based software development environment utilizing virtual
environment techniques. The model is made up of three separate components:
Groupspaces, which provide a persistent organization of software artifacts, 
Groupviews, multiuser user interfaces including awareness mechanisms, and Software 
Immersion, which creates an immersive environment from the artifacts populating a 
Groupspace. 

One area for future work on CHIME is support for easy population of a Groupspace 
with a set of objects from existing development efforts, as well as automatically gener- 
ating links between artifacts which seem to be related. A number of efforts in the Ra- 
tionale Capture and Reverse Engineering communities have explored this, and we in- 
tend to look into utilizing existing systems of this sort with CHIME. 

We plan to incorporate the Rivendell tool management system we have previously de- 
veloped[24] to provide tool launch and sharing capabilities to CHIME. Although our 
experiment with the Linux kernel involved a small number of tools, we envision much 
richer tool support for future CHIME incarnations. 

As mentioned in our evaluation of CHIME, we intend to incorporate research into ver- 
sioning and transaction support for virtual environments from[22]. This is the most 
glaring omission from the current implementation, and must be addressed quickly. 

It has been the custom of our research lab to use our own tools to support our work. In 
this regard, we intend to use CHIME for our own development work, involving a small 
team of programmers (5-10) working on a number of separate but interrelated projects. 

Abstract. The use of smart cards to run software modules on demand 
has become a major business concern for application issuers. Such down- 
loadable executable content needs to be trusted by the card execution 
environment in order to ensure that an instruction on a memory area 
is compliant with the definition of the data stored in this area (i.e. its 
type). Current solutions for smart cards rely on three techniques. For 
Java Card, either an off-card verifier-converter performs a static verifica- 
tion of type-safety, or a defensive virtual machine performs the verifica- 
tion at runtime. For other types of open smart cards, no type-checking 
is carried out and the trust is only based on the containment of applications.
Static verification is more efficient and flexible than dynamic
techniques. Nevertheless, as the Java verifier cannot fit into a card, the 
trust is dependent on an external third party. In this way, card security
has been partly delegated to the outside. We propose and describe
the FACADE language for which the type-safety verification can be per- 
formed statically on-card. 



1 Introduction 

In this section the specific domain of smart cards is described. For readers
unfamiliar with smart cards, we briefly review the technology and the history of smart
cards, from embedded devices dedicated to specific applications up to open platforms
that enable the downloading of new services throughout the card's life.

1.1 The Specific Domain of Smart Cards 

Smart cards form a specific domain in three ways, which we detail hereafter: their 
internal constitution, their interfaces, and their applications. 

A smart card is a piece of plastic, the size of a credit card, in which a single 
chip microcontroller is embedded. Usually, microcontrollers for cards contain 
a microprocessor (8-bit ones are the most widespread, but 16-bit and 32-bit 
processors can now be included in the chip) and different kinds of memories: 
RAM (for run-time data), ROM (in which the operating system and the basic 
applications are stored), and EEPROM (in which the persistent data are stored). 
Since there are strong size constraints on the chip, the amounts of memory are 
small. Most smart cards sold today have at most 512 bytes of RAM, 32 KB of 
ROM, and 16 KB of EEPROM. This chip usually also contains some sensors (like 
light sensors, heat sensors, voltage sensors, etc.), which are used to deactivate 
the card when it is somehow physically attacked. 

In order to be usable, a smart card must be inserted in a card reader/writer,
which provides power to the card, as well as a clock signal. Also, any communication
between the terminal and the card goes through the card reader/writer
in the form of messages exchanged from the terminal to the card (commands), 
and respectively, from the card to the terminal (responses). All these basic as- 
pects are strongly standardized, since cards are meant to be usable with a wide 
range of devices. The ISO 7816 family of standards is the reference [3]. They 
standardize many features, from the positions of the contacts on the card to 
the transport protocol that is used to communicate between the card and the 
terminal. 

A smart card can be viewed as a “data safe”, since it stores data in a secure 
manner and it is used securely during short transactions. Its hardware is the base 
for its safety. The fact that the chip in a card is embedded with sensors in plastic 
and glue, and that all components are on the same chip makes a physical attack 
quite difficult. The software is the second barrier for its safety. The chip programs 
are usually designed neither to return nor to modify sensitive information
before ensuring that the operation is authorized. In fact, most card applications
use the card either to safely store data, or to process sensitive data. Most smart 
cards include some support for cryptographic functions, which allows them to 
secure their transactions with the outside world. More practically, cards are often 
used either to manage some kind of currency (money or tokens) or to identify a 
person. 

1.2 Smart Cards State-of-the-Art 

The specific domain of smart cards is close to the domain of embedded devices. 
Like embedded devices, smart cards are aimed toward the consumer electronics 
market, which demands ever more convenience and flexibility from these
systems.

The methods, languages and tools for developing a smart card system share 
some characteristics with those of the embedded domain. Until recently, smart
card code was written in hand-coded native assembler. All programs (drivers,
operating system, libraries, applications) were developed in a monolithic piece of
code burned in the ROM of the smart card. Therefore, not only are traditional card
systems difficult to develop (low-level programming language, very reduced-feature
microcontroller, specific code for every microprocessor) but they also
cannot support evolution of their applications since all the application code is
burned forever with the runtime engine in the ROM.

Moreover, the production of such a “petrified” monolithic program dedicated 
to a specific hardware and with ad hoc functions for the application domain consumes
most of the card development cycle. In order to issue a card application,
what is needed is (i) to write precise specifications, (ii) to write or re-write the
basic software (akin to an operating system) for possibly multiple platforms,
(iii) to develop specific functions for the application, and (iv) to verify this
software before deploying it on thousands or millions of cards. This process is
time-consuming and costly. Since defining specifications for products that will 
be available long afterwards is risky, this process hinders the creation
of new markets. Because it takes a long time, it also severely limits the ability of
a card issuer to rapidly deploy new applications in accordance with market
needs.

In order to cope with these market needs (to simplify, a reduced “time-to- 
market” and flexibility for card applications) new generations of smart cards 
(called open smart cards) have emerged during the last two years. The most notable
efforts towards such smart card systems are Java Card [12,13], MultOS [5], and
Smart Card for Windows [6], which provide application developers an opportunity
to create applications on a common base of code. They contain a platform for the 
dynamic (i.e. on demand) storage and the execution of downloaded executable 
content, which is based on a virtual machine for portability across multiple smart 
card microcontrollers and for security reasons. 

While these new smart cards bring solutions regarding the market needs, 
they also introduce new problems for smart card makers. They provide solu- 
tions for card application developers by enabling them to program in high-level 
languages, on a common base of software (an abstract machine and application 
programming interfaces) which isolates their code from specific hardware. In that 
sense, they can reduce drastically the time to get new applications to market. 
They also tend to support both the flexibility and the evolution of applications 
by enabling the downloading of executable content in already deployed smart 
cards. This latter characteristic requires more sophistication in terms of security
techniques since any program could potentially damage or misappropriate the
whole card system. This paper deals with this issue as summarized in the next
section.



1.3 Outline of this Paper 

Ensuring that a program cannot damage a system consists in denying it access to
memory areas other than those reserved for its execution and data (containment
or “sandboxing”). Containment relies on access controls to memory areas. It
would be better to associate containment with a protection mechanism that also
verifies that every instruction accesses data with respect to their types. In
fact, a more fine-grained protection consists in verifying that every
instruction that accesses a memory area is compliant with the definition of the
data stored in this area (i.e., its type, or its class in an object-oriented
system). This latter technique has been popularized by the Java language [4].

In the field of open smart cards, this security issue is generally addressed
by the use of three techniques. For Java Card, either an off-card
verifier-converter performs a static verification of type-safety, or a
defensive virtual machine performs the verification at runtime. For other types
of open smart cards, no type-checking is carried out and the trust is only
based on the containment of applications. For traditional smart cards (with all
the code burned in their ROM), these techniques are not used because the
confidence in the programs is based on the fact that they have been tested and
verified before the delivery of the cards.

In this paper, we propose a new architecture to build smart card code that
is based on a typed intermediate language called FACADE. It aims at providing
a coherent system for card software engineering, enabling both the production
of efficient traditional smart cards and the building of type-safe downloadable
content for open smart cards.

Section 2 debates the security schemes used in open smart cards before detail- 
ing the FACADE architecture. In Section 3 we focus on the verification process 
and on the precise static semantics of FACADE. A formal model using the B 
language exhibits the type-safety properties to be checked. Finally, Section 4
summarizes the paper and outlines future work.

2 FACADE Architecture 

2.1 Language Related Work 

The current open smart cards provide programmers with the ability to download 
programs dynamically. This characteristic aims at making smart card systems 
more convenient in two ways. On the one hand, they support the use of
high-level languages (like Java with Java Card, or Visual Basic with Smart Card
for Windows) in spite of smart card constraints. On the other hand, they guarantee
that the programs loaded in a smart card are not able to compromise the security 
of the other programs. 



Smart Card Compilation and Loading Process Though current smart 
cards are able to run programs written in high-level languages, they still have 
drastic limitations. Generally, the compilation of programs expressed in high- 
level languages produces a code unsuited to the available hardware. Two solu- 
tions have been used in the existing products. 

The first solution consists in defining programming languages dedicated to
smart cards. These languages tend to be close to those used on workstations,
but they force programmers to take into account some specific aspects of smart
card microcontrollers. For example, the language MEL has been specifically
defined to program the MultOS card operating system [5]. For convenience, a
domain-specific compiler can generate MEL code from C.

Another solution consists in converting the code produced by a traditional 
compiler into a specific code adapted to the smart card constraints. For example, 
the Java class file format is too heavy to be loaded in Java Card. Thus, the class 
files produced by compilers will be treated by a specific converter in order to 







G. Grimaud, J.-L. Lanet, and J.-J. Vandewalle 



get a compressed version of class files [14]. Smart Card for Windows is another 
example of the same technique. 

Figure 1 presents the process of compilation and loading associated with 
these two approaches.

Fig. 1. Java Card and MultOS compilation processes



Code Downloading and Smart Card Security Currently, the Java Card 
framework does not support Java-like dynamic class loading. The main problem 
is that a smart card environment is too small for running the Java class file 
verifier. Open card architectures hence propose a downloading framework with 
reduced flexibility, in which the download unit is the application (or package in 
Java) rather than a single class file. In addition, new frameworks for program 
distribution and downloading have been proposed. The two major proposals are 
outlined in Figure 2. 

The first solution is known as the defensive virtual machine. All safety checks are
performed at run-time, hence replacing the pro-active static bytecode verification
by a defensive run-time bytecode verification. For instance, before performing 
a write operation to a given memory area, the virtual machine checks the type 
and access rights associated with the data located in that target area. In Java 
Card, the checks are mostly related to typing; in MultOS, the checks are related 
to access right checks, since security is based on application containment. 

Defensive virtual machines have two major drawbacks. First, the number of
run-time checks to be performed severely hinders their performance; second,
some additional data (such as typing information) needs to be stored, which
increases the memory requirements. As a consequence, this technique can only be
applied to small programs, and this drastically reduces its flexibility and
usefulness.

The second solution consists in performing the security checks outside the
smart card. Before a program is downloaded onto a smart card, it is first
transmitted to a trusted third party (usually the card issuer), which checks
the program and certifies it. When the program is downloaded, it is accompanied
by its certificate. Before allowing this program to run, the smart card checks
the validity of the certificate. This solution is currently the one proposed by
the major card issuers. However, it has a severe drawback: the whole security
of the smart card system relies on the security of the certification scheme.
This solution is satisfactory for local deployments, but it can be
unsatisfactory for the long-term deployment of large-scale applications. In
fact, a central point of trust (the certification authority) creates a single
point of failure that can compromise the whole security architecture in case of
attack. Also, in some cases, this centralization is not well suited to
applications in which the various participants are not able to trust the same
authority. Moreover, the implementation and deployment of a certification
infrastructure is difficult and requires costly operations like monitoring and
maintenance. For these reasons, we argue that it would be far preferable to
design a solution in which the security of the smart card system relies on the
smart card system rather than on a certification scheme.

Fig. 2. Two solutions for open smart cards



Another Approach: FACADE A recent research work [10] proposes to adapt
the Proof Carrying Code [8] technique as a means for verifying on-card the
type-safety of Java Card programs. This technique seems very promising, but the
complexity of the on-card processing for a full type-checking makes it still
difficult to implement on current smart cards. We propose to solve this problem
by introducing a typed intermediate language called FACADE in which the
type-safety can be verified on-card. As this language is intended to be a
target platform for multiple high-level languages such as Java or Visual Basic,
it also solves some interoperability problems between new open smart card
systems. The FACADE language is also a part of a framework for producing card
programs. Inside this framework, it has two main properties: (1) it is designed
specifically for an implementation on reduced-feature devices, and (2) it is
designed to enable efficient static checking.



2.2 The FACADE System 

The FACADE language is targeted toward very small platforms, but it is not 
specifically targeted toward open smart cards. The advantage of using such an 
intermediate language is that it can be used for all kinds of smart cards and small 
devices. The language is secure enough to be downloaded into an open smart 
card, but it can also be used to produce optimized native code, to be included in 
the ROM of a smart card. The main application of such an idea would be to use
a high-end open smart card to prototype an application, and then to produce
an optimized dedicated code in order to obtain smaller (and cheaper) cards for 
mass production. 

In our system, the FACADE typed intermediate language is a central element,
as shown in Figure 3. Programs written in various source languages are first
fed into a language-specific off-card front end which does lexical analysis,
parsing, type-checking, and compilation into specific representations. The
result is then translated into the FACADE intermediate format. The off-card
middle end does conventional optimizations. It also generates the elements that
will be necessary to prove the type-safety of the program when it is loaded
into the smart card. At this step, the FACADE code may be used in two different
directions:

1. The on-card back end for open smart cards verifies the type-safety of the
program at loading time, and generates a dedicated code for the target hardware
or software machine.

2. The off-card back end for ad hoc smart cards uses the FACADE code as the
source code for producing a complete card code dedicated to be burned in
ROM thanks to the usage of a ROMizer (a ROMizer is a piece of software used to
produce ROM binary images of programs).

All the production tools are deliberately made independent of each other so that
they may support different source languages and different target platforms. This
paper only deals with the proof generator and verifier tools. The other
components of this architecture are being investigated in ongoing experimental
research.




Fig. 3. FACADE architecture 



Using a common intermediate language to share compiler infrastructure is 
not a new idea. Many compilers have used or use a common intermediate format 
for multiple source languages (e.g., GNU gcc, Eiffel, Pascal, etc.). For a long
time, Eiffel has used the C language as its intermediate language. For more 
advanced systems, specific languages have been defined. For instance, the FLINT 
architecture [11] is based on a typed intermediate format that supports higher- 
order functions and an advanced polymorphic type system. However, none of 
these languages is appropriate for small platforms such as smart cards. 

The main characteristic of FACADE is its ability to prove the type-safety 
of FACADE code using small hardware like a smart card. This feature is very
important in order to use the language as a target for the compilation of the
Java language, whose security relies heavily on type-checking. Here, instead of
trying to fit a standard Java type-checker on a smart card (as done in [9]), we
have chosen to design a language which caters to the very specific needs of smart 
cards. In the following sections, we focus on the way in which type-checking can 
be performed on the FACADE language on a smart card. 



2.3 Principle of Type-Checking: Type Inference 

Program checks are usually based on data flow analysis. This analysis aims at
determining the type of the variables at each program point (pp). The type of
some variables changes during execution because they are temporarily used for
different computations.



The Hierarchy of Types In the FACADE architecture, all information is
associated with types. FACADE is an object-oriented intermediate language.
Thus, types are commonly defined by classes. The hierarchy of classes is an
extensible data structure. Indeed, when new applications are loaded, they add
their own classes to those already recorded in the card system. These extensions
are done in a tree-based model. Each class has one and only one super class.
Moreover, one particular class, CardObject, is the top of the hierarchy. Thus,
directly or indirectly, any class extends CardObject, which is the root of the
inheritance graph.

For the data flow analysis, we define a semi-lattice as a tuple
$L = (V, \subseteq, \cap)$ where $V$ is the set of the class types, $\subseteq$
is a partial order defined over $V$, and $\cap$ is a binary operation defined
over $V$. Two elements $i, j \in V$ are incomparable if neither $i \subseteq j$
nor $j \subseteq i$. For $i, j \in V$, we say that $j$ covers $i$ if
$i \subset j$ and there is no $k$ such that $i \subset k \subset j$. In the
diagram of an ordered set $(V, \subseteq)$, two elements $i, j \in V$ are
directly connected if one covers the other. Figure 4 presents the ordered set
of FACADE types. The greatest element CardObject of $V$ is called the top
element of $V$ and can be written $\top$.
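
To make the ordering concrete, the class tree can be sketched in a few lines of Python. This is only an illustrative sketch: the placement of Int, Bool and IntArray directly under CardObject is an assumption (the text only states that Short is a subtype of Int and that CardObject is the root), and the function names are hypothetical. The combination of two types is taken here to be their nearest common ancestor in the tree, which is one reading of the binary operation of the semi-lattice.

```python
# Hypothetical class tree: each class maps to its single super class.
# Only the class names come from the text; their placement is assumed.
PARENT = {
    "CardObject": None,       # top element of the hierarchy
    "Int": "CardObject",
    "Short": "Int",
    "Bool": "CardObject",
    "IntArray": "CardObject",
}


def ancestors(t):
    """Return t followed by its super classes, up to the root."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain


def is_subtype(i, j):
    """True if i is lower than or equal to j in the class tree."""
    return j in ancestors(i)


def covers(j, i):
    """True if j covers i, i.e. j is the direct super class of i."""
    return PARENT[i] == j


def combine(i, j):
    """Nearest common ancestor of i and j (assumed combination rule)."""
    seen = set(ancestors(i))
    for t in ancestors(j):
        if t in seen:
            return t
    raise ValueError("types do not share a root")
```

With this tree, Short and Bool are incomparable, Int covers Short, and combining Bool with CardObject yields CardObject.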



Type-Safe Operations A program is made of a sequence of elementary
operations. Each one uses some number of input data and modifies some number
of output data. In FACADE, there are only five distinct elementary operations.
Below, for each of them, we informally provide (i) its operational semantics
and (ii) its static semantics.

1. Return VarRes

(i) This operation ends the execution of the processing and returns the value
of the variable VarRes.

(ii) This operation is legitimate if the type of the returned variable (VarRes)
is equal to, or a subtype of, the return type declared in the method signature.

2. Jump LabelId

(i) This operation is an unconditional jump to another operation of the
program marked with the label LabelId defined in a label table.

(ii) This operation is defined only if the label LabelId exists and the value of
the associated pp is valid.

3. JumpIf Var LabelId

(i) This operation is a conditional jump. The variable Var is of type Bool
and determines whether the jump takes place or is ignored.

(ii) If the label LabelId exists and the variable Var is of type Bool, this
operation is correct.

4. JumpList Var nbcase {LabelId1, LabelId2 ...}

(i) This conditional jump reaches the label whose sequence number, in the
list of the labels, is equal to the variable Var. If the variable Var is negative
or if Var is a number greater than the cardinal of the label list (defined by
the immediate value nbcase), the operation does not perform any jump but
increments the program point.

(ii) If each label LabelIdx exists, if the number of labels is less than or
equal to nbcase, and if the variable Var is of type Int, then the operation can
be carried out.

5. Invoke VarRes Var methodid {tabVar1, tabVar2 ...}

(i) This operation executes the method methodid on the variable Var, with
the parameters tabVar1, tabVar2 ... VarRes is the variable that contains
the value returned by the method call. In fact, this operation of the
intermediate language FACADE could be translated in the card, according to the
nature of the method methodid, either into a jump to another procedure,
or into a set of elementary operations of the target machine.

(ii) The call of a method is type-safe if (a) the method methodid is
declared in the class (the type) of Var, (b) the types of the variables
tabVar1, tabVar2 ... are included in those expected by the method methodid
of the class of Var, and (c) the type of the variable VarRes is that returned
by the method methodid.

An elementary FACADE operation is considered type-safe if the inferred
types conform to those defined by the static semantics. For instance, let
i and j be respectively of types Int and IntArray; an incorrect operation is
Invoke j, i, add, j. The addition method of the Int class takes an integer
value as its second parameter. As IntArray is not a subtype of Int, such an
operation is illegal. As another example, let i be a variable of type Short;
the following operation is correct: JumpList i, 4, (l1, l2, l3, l4), because
the type Short is a subtype of Int.
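
The two examples above can be replayed with a small Python sketch of the static checks for Invoke and JumpList. It is a simplified illustration, not the FACADE verifier: the class tree, the method descriptor table `METHODS` and the function names are assumptions, and the Invoke check only models the case where VarRes has a fixed type that must equal the declared return type.

```python
# Assumed class tree (child -> super class); only the names come from the text.
PARENT = {"CardObject": None, "Int": "CardObject", "Short": "Int",
          "Bool": "CardObject", "IntArray": "CardObject"}

# Hypothetical method descriptors: (class, method) -> (parameter types, return type).
METHODS = {
    ("Int", "add"): (["Int"], "Int"),
    ("Int", "LtOrEq"): (["Int"], "Bool"),
}


def is_subtype(i, j):
    """True if i equals j or inherits from j."""
    while i is not None:
        if i == j:
            return True
        i = PARENT[i]
    return False


def check_invoke(types, var_res, var, method, args):
    """Static check of Invoke for a result variable with a fixed type."""
    sig = METHODS.get((types[var], method))
    if sig is None:                      # (a) method declared in the class of Var
        return False
    params, ret = sig
    if len(args) != len(params):
        return False
    if not all(is_subtype(types[a], p) for a, p in zip(args, params)):
        return False                     # (b) arguments match the expected types
    return types[var_res] == ret         # (c) VarRes has the returned type


def check_jumplist(types, known_labels, var, nbcase, labels):
    """Static check of JumpList: labels exist, count fits nbcase, Var is an Int."""
    return (all(label in known_labels for label in labels)
            and nbcase >= len(labels)
            and is_subtype(types[var], "Int"))


if __name__ == "__main__":
    types = {"i": "Int", "j": "IntArray", "s": "Short"}
    print(check_invoke(types, "j", "i", "add", ["j"]))   # False: IntArray is not an Int
    labels = {"l1", "l2", "l3", "l4"}
    print(check_jumplist(types, labels, "s", 4, ["l1", "l2", "l3", "l4"]))  # True
```

The first call is rejected at check (b), which matches the reason given in the text; the second is accepted because Short is a subtype of Int.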

In fact, the variable type checks would be simple if the type of a
variable did not change during the life span of the method. Unfortunately, this
is not the case for all kinds of variables. We distinguish two kinds of
variables: those which have an invariable type (e.g., the attributes of an
object), and those which have a variable type (e.g., the variables used for
temporary storage during the evaluation of complex expressions). In the first
case we speak of Local variables, and in the second we use the term Tmp.



Type-Safe Verifier and Smart Card Constraints A type verifier proves 
that each instruction contained in a program will be executed with values having 
the types expected by the instruction. The type verifier reads the program and 
notes for each elementary operation, the produced types and the consumed types. 

Generally the execution of a program is not sequential. This is why a
traditional verifier follows the tree of paths covering each case of the
program execution. The computation of this tree structure establishes with
certainty the types of the variables consumed and modified by the elementary
operations. When various paths end in the same program point, the type verifier
determines a single type for each Tmp variable.

The fixpoint search is the most complex aspect of the type-checking
algorithm. As shown previously, it requires the production of a complex data
structure, and thus the execution of a complex recursive processing. The smart
card cannot perform this recursive processing because of its hardware
limitations. Therefore, no smart card offers an operating system that performs
type-safety checks at load time. And if the type-safety checks are performed at
runtime, they slow down the virtual machine.



A Two-Step Solution Some studies propose to distribute the type-checking
of downloadable code between a code-producer and a code-receiver. The most
outstanding works in this field are Proof Carrying Code [8] (PCC) and Typed
Assembly Language [7] (TAL). The main idea of these works is that the
code-producer generates a proof of the program safety. The code is transmitted
with this proof to the receiver. The code-receiver checks the program using the
proof. The interest of this technique is that the proof verification is much
easier than the proof production [10].

In our case, the proof is the result of the type combination for a given label.
Like TAL, we propose to provide, with the transmitted program, a list of labels
with a state of the Tmp variables for each of them, i.e., for each program point
which can be reached by a jump. The receiver performs a sequential analysis
of the code. When an operation is marked by a label, the receiver checks that
its current inferred types are lower than or equal to (relation $\subseteq$ of
the type hierarchy) those defined by the proof. Moreover, when the receiver
checks any kind of jump, it makes sure that its inferred types are lower than
or equal to those defined for the label. A program is refused if any check
fails.

3 Type-Safety Verification 

In the FACADE approach, the verification process is split in two parts. The
resource-consuming algorithm (i.e., the label generator) runs on the terminal
side (code-producer) while the verification of the labels and the type inference
between two labels are done on the card side (code-receiver). The first
algorithm is a traditional type reconstruction and verification of all the
possible execution paths. For each label (i.e., each point that can be reached
by a jump operation), the types of the untyped local variables are combined
over all the incoming paths. The label table is transmitted to the card with
the FACADE code. At loading time, the card sequentially computes the inference
and compares it with the label table. In case of divergence the card refuses
the code.

The verification process must ensure that every execution will be safe. This
means that, starting in a safe state, each operation will lead the system into
another safe state. For that purpose, a transition system describing a set of constraints is
constructed. It defines a correct state for each operation (preconditions) and how 
the system state evolves (post-conditions). Preconditions and post-conditions 
define the static semantics of FACADE. In particular, it must be ensured that: 

• all instructions have their arguments with the right type before execution, 

• all instructions transform correctly the local variables for the next pp. 



3.1 Static Semantics of FACADE 

The system maintains some tables: the class descriptors (classDsc) and the
method descriptors (methodDsc). The class descriptors table contains the
definitions of the classes present in the card and the class hierarchy. The
method descriptors table contains, for every class, the signatures of the
already checked methods. Once a method has been accepted by the verifier, it is
added to the method descriptors.

We briefly introduce our notation. Programs are treated as partial maps
from addresses to instructions. If $Pgm$ is a map, $Dom(Pgm)$ is the domain
of $Pgm$. $\forall pp \in Dom(Pgm)$, $Pgm_{pp}$ is the value of $Pgm$ at
program point $pp$, which is written $Pgm[pp \mapsto u]$. A program $P$ is a
map from program points to instructions. The model of our system is represented
by a tuple $\langle pp, Tmp \rangle$ where $pp$ denotes a program point (or
program counter), and $Tmp$ the set of untyped local variables. The types of
the typed local variables, noted $L$, are never changed, and therefore are not
part of the program state. The vector of maps $T$ contains static information
about the local variables. The vector $T$ assigns types for the $Tmp$ variables
at program point $pp$ such that $\forall i, (Tmp[i] : T_{pp}[i])$. A program is
well typed if there exists $T$ that satisfies $T \vdash P$.

A program is correct if the initial conditions are satisfied and if every
instruction in the program is well typed according to its static constraints:

$$\frac{Dom(T_1) = \{\} \qquad \forall i \in Dom(P),\ (T, i \vdash P)}{T \vdash P}$$

In Figure 5 we give a formal definition for the control flow instructions
(Return, Jump, JumpIf, and JumpList) and for the invocation instruction
(Invoke). There are two static semantics for the Invoke instruction depending
on the kind of variable used for VarRes.



3.2 Off-Card Label Generation

During this phase, we can construct the type information for local variables for
each program point using a traditional type inference algorithm (for example,
Dwyer's algorithm [2]) and then calculate, for every label, the value of
$T$ for every path. An inference point ($ip$) is a pair made of a program point
associated with the $T$ type table for that program point (written
$T_{extern}$). They are collected in a table, the labelDsc table, which is sent
to the smart card. When a new method has been loaded, it is verified that the
label descriptors only reference valid addresses:
$\forall i, (i \in labelDsc[pp] \Rightarrow i \in Dom(P))$. Finally, we
can formally define the label descriptors by:

$$\begin{array}{l}
ip \in labelDsc,\ ip = (pp, T_{extern}) \\
pp \in Dom(P) \\
T_{extern} = T_{pp}
\end{array}$$

Consider the following example (see Table 1): this piece of FACADE code
(we call it program P) uses two variables i and v in order to compute a loop
from 0 to 100.

Using a fixpoint algorithm, the inference procedure checks all the paths of
the tree. In this example, it needs eight steps, as illustrated in Table 2.

At step 1, i and v are Tmp variables whose types are unknown before
the operation. The type of the return variable for the method likeIt from the
class Int is Int. Thus, at step 2, i is assigned the type Int. At step 4, a
first choice is made by jumping to pp 3; the other choice is verified at step
8. At step 6, the state is different from that of step 3 (for the same pp);
thus, we have not reached a fixpoint. At step 7, the states are identical,
meaning that a fixpoint has been reached. Now, the pending path memorized at
step 4 can be checked.

$$\frac{\begin{array}{c}
P[i] = \mathtt{Return}\ VarRes \\
i \neq Card(P) \\
VarRes : methodDsc[ThisMethodId][ReturnType] \\
i + 1 \in Dom(P)
\end{array}}{T, i \vdash P}$$

$$\frac{\begin{array}{c}
P[i] = \mathtt{Jump}\ LabelId \\
LabelId \in Dom(P)
\end{array}}{T, i \vdash P}
\qquad
\frac{\begin{array}{c}
P[i] = \mathtt{JumpIf}\ Var\ LabelId \\
Var : Bool \\
LabelId \in Dom(P) \\
i + 1 \in Dom(P)
\end{array}}{T, i \vdash P}$$

$$\frac{\begin{array}{c}
P[i] = \mathtt{JumpList}\ Var\ nbcase\ Label \\
Var : Int \\
nbcase : Nat \\
nbcase \geq Card(Label) \\
\forall pp, (pp \in Label \Rightarrow pp \in Dom(P)) \\
i + 1 \in Dom(P)
\end{array}}{T, i \vdash P}$$

$$\frac{\begin{array}{c}
P[i] = \mathtt{Invoke}\ VarRes\ Var\ MethodId\ tabVar \\
MethodId : methodDsc_{classDsc(Var)} \\
VarRes \in Tmp \\
T_{i+1} = T_i[VarRes \mapsto methodDsc[MethodId][ReturnType]] \\
\forall j, (j \in 1..methodDsc[MethodId][nbvar] \Rightarrow tabVar[j] : methodDsc[MethodId][j]) \\
i + 1 \in Dom(P)
\end{array}}{T, i \vdash P}$$

$$\frac{\begin{array}{c}
P[i] = \mathtt{Invoke}\ VarRes\ Var\ MethodId\ tabVar \\
MethodId : methodDsc_{classDsc(Var)} \\
T_{i+1} = T_i \\
VarRes \in L \\
VarRes : methodDsc[MethodId][ReturnType] \\
\forall j, (j \in 1..methodDsc[MethodId][nbvar] \Rightarrow tabVar[j] : methodDsc[MethodId][j]) \\
i + 1 \in Dom(P)
\end{array}}{T, i \vdash P}$$

Fig. 5. Static semantics of FACADE control flow and invocation instructions



For pp 4, we have to combine the two states $T_4[i,v] = (Int, Bool)$ (at
step 6) and $T_4[i,v] = (Int, \top)$ (at step 3). Using our semi-lattice, the
greatest lower bound of these two states is $T_{extern}[i,v] = (Int, \top)$.
Our labelDsc is made of two $ip$: $(4, [Int, \top])$ and $(3, [Int, Bool])$.
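
The off-card label generation for the sample program can be sketched as a small worklist fixpoint in Python. This is a simplified stand-in for a real inference algorithm such as the one cited above, not the authors' implementation: it only propagates result types, the return types in `RETURNS` are assumed descriptors, labels are written directly as program points, and two types are combined by taking their nearest common ancestor in an assumed class tree.

```python
TOP = "CardObject"
# Assumed class tree (child -> super class).
PARENT = {TOP: None, "Int": TOP, "Bool": TOP}

# Assumed return types of the methods used by the sample program.
RETURNS = {"likeIt": "Int", "add": "Int", "LtOrEq": "Bool"}

# Sample program of Table 1: pp -> instruction (labels replaced by their pp).
PROGRAM = {
    1: ("Invoke", "i", "likeIt"),
    2: ("Jump", 4),
    3: ("Invoke", "i", "add"),
    4: ("Invoke", "v", "LtOrEq"),
    5: ("JumpIf", "v", 3),
    6: ("Return",),
}


def combine(a, b):
    """Nearest common ancestor of two types in the class tree."""
    chain = []
    while a is not None:
        chain.append(a)
        a = PARENT[a]
    while b not in chain:
        b = PARENT[b]
    return b


def successors(pp, ins):
    """Program points that may follow the instruction at pp."""
    if ins[0] == "Jump":
        return [ins[1]]
    if ins[0] == "JumpIf":
        return [ins[2], pp + 1]
    if ins[0] == "Return":
        return []
    return [pp + 1]


def transfer(state, ins):
    """Types of the Tmp variables after executing ins."""
    if ins[0] == "Invoke":
        out = dict(state)
        out[ins[1]] = RETURNS[ins[2]]
        return out
    return state


def generate_labels(program):
    """Return {pp: types of the Tmp variables} for every jump target."""
    entry = {1: {"i": TOP, "v": TOP}}   # types are unknown at the first pp
    work = [1]
    while work:
        pp = work.pop()
        out = transfer(entry[pp], program[pp])
        for nxt in successors(pp, program[pp]):
            old = entry.get(nxt)
            new = out if old is None else {x: combine(old[x], out[x]) for x in out}
            if new != old:              # state changed: no fixpoint yet
                entry[nxt] = new
                work.append(nxt)
    targets = {ins[-1] for ins in program.values() if ins[0] in ("Jump", "JumpIf")}
    return {pp: entry[pp] for pp in sorted(targets)}
```

On the sample program, `generate_labels(PROGRAM)` yields the state (Int, CardObject) for pp 4 and (Int, Bool) for pp 3, with CardObject playing the role of $\top$.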

In order to optimize the size of the data sent to the smart card, the labelDsc
table can be reduced. In fact, by verifying the definition-use paths, one can
remark that the information on the variable v at step 6 is useless because v is
always defined before any use. Thus, it is possible to simplify the labelDsc as
shown in Table 3.

Table 1. A sample FACADE program

pp   Label    Instruction                 Comment
1             Invoke i, 0, likeIt         i is set to 0
2             Jump Label0                 go to the test at Label0
3    Label1   Invoke i, i, add, 1         increment i
4    Label0   Invoke v, i, LtOrEq, 100    test if i is less than or equal to 100
5             JumpIf v, Label1            if TRUE go to Label1
6             Return void                 otherwise, return



Table 2. Off-card type inference procedure of the sample FACADE program

Step   pp   Label    Instruction                 T[i,v]
1      1             Invoke i, 0, likeIt         ($\top$, $\top$)
2      2             Jump Label0                 (Int, $\top$)
3      4    Label0   Invoke v, i, LtOrEq, 100    (Int, $\top$)
4      5             JumpIf v, Label1            (Int, Bool)
5      3    Label1   Invoke i, i, add, 1         (Int, Bool)
6      4    Label0   Invoke v, i, LtOrEq, 100    (Int, Bool)
7      5             JumpIf v, Label1            (Int, Bool)
8      6             Return void                 (Int, Bool)



Table 3. Simplification of the label descriptors table (labelDsc)

ip   pp   T_extern[i,v]
0    4    (Int, $\top$)
1    3    (Int, $\top$)

3.3 On-Card Type Inference and Verification 

The on-card FACADE verifier uses the off-card-generated label descriptors in 
order to reduce its footprint in memory as well as the time needed by this 
process. After receiving the application and its labelDsc, the verifier checks the 
conformity of the label descriptors. Then, it begins the verification process by 
running sequentially through the code. 

The procedure is the following (see Table 4): for each instruction, the verifier
checks whether the program point is referenced in the label descriptors. In
such a case, the table $T_{extern}$ replaces $T_c$; otherwise, the algorithm
uses $T_c$. It applies the static semantics to verify the correctness of the
code. After an unconditional Jump, the following instruction must necessarily
have a label, and the verifier then replaces $T_c$ by $T_{extern}$.



Table 4. On-card verification procedure of the sample FACADE program

Step   pp   Instruction                 T_extern         T_c[i,v]
1      1    Invoke i, 0, likeIt                          ($\top$, $\top$)
2      2    Jump Label0                                  (Int, $\top$)
3      3    Invoke i, i, add, 1         (Int, $\top$)    (Int, $\top$)
4      4    Invoke v, i, LtOrEq, 100    (Int, $\top$)    (Int, $\top$)
5      5    JumpIf v, Label1                             (Int, Bool)
6      6    Return void                                  (Int, Bool)



At step 2, before the Jump, $T_c$ has been inferred as $(Int, \top)$. At step 3,
the program point is included in the label descriptors; thus, $T_c$ is
overwritten by the external table $T_{extern}$. At step 4, the transfer
function associated with the method LtOrEq applied on an Int changes the type
of the return parameter to a Bool. At step 5, the precondition of a JumpIf is
to have a Bool value as parameter, which is satisfied.
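
The single sequential pass described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the on-card implementation: the class tree, the `METHODS` descriptors (receiver type and return type), the encoding of instructions as tuples and the use of `None` for a constant receiver are all hypothetical, and only the checks discussed in the text are modelled.

```python
TOP = "CardObject"
# Assumed class tree (child -> super class).
PARENT = {TOP: None, "Int": TOP, "Bool": TOP}

# Assumed descriptors: method -> (receiver type, return type).
METHODS = {"likeIt": ("Int", "Int"), "add": ("Int", "Int"), "LtOrEq": ("Int", "Bool")}

# Sample program of Table 1 (labels replaced by their pp; None = constant receiver).
PROGRAM = {
    1: ("Invoke", "i", None, "likeIt"),
    2: ("Jump", 4),
    3: ("Invoke", "i", "i", "add"),
    4: ("Invoke", "v", "i", "LtOrEq"),
    5: ("JumpIf", "v", 3),
    6: ("Return",),
}

# Simplified label descriptors of Table 3.
LABEL_DSC = {4: {"i": "Int", "v": TOP}, 3: {"i": "Int", "v": TOP}}


def is_subtype(a, b):
    """True if a equals b or inherits from b."""
    while a is not None:
        if a == b:
            return True
        a = PARENT[a]
    return False


def state_leq(s, t):
    """True if every variable type in s is lower than or equal to its type in t."""
    return all(is_subtype(s[x], t[x]) for x in t)


def verify(program, label_dsc):
    """One sequential pass over the code; True if the program is accepted."""
    if any(pp not in program for pp in label_dsc):
        return False                      # labels must reference valid addresses
    tc = {"i": TOP, "v": TOP}             # current inferred types (T_c)
    for pp in sorted(program):
        ins = program[pp]
        if pp in label_dsc:
            # Falling through into a label: T_c must fit the declared state.
            if tc is not None and not state_leq(tc, label_dsc[pp]):
                return False
            tc = dict(label_dsc[pp])      # T_extern replaces T_c
        elif tc is None:
            return False                  # code after a Jump must carry a label
        op = ins[0]
        if op == "Invoke":
            _, res, var, method = ins
            recv, ret = METHODS[method]
            if var is not None and not is_subtype(tc[var], recv):
                return False
            tc[res] = ret
        elif op in ("Jump", "JumpIf"):
            target = ins[-1]
            if op == "JumpIf" and not is_subtype(tc[ins[1]], "Bool"):
                return False
            if target not in label_dsc or not state_leq(tc, label_dsc[target]):
                return False
            if op == "Jump":
                tc = None                 # nothing falls through a Jump
        elif op == "Return":
            tc = None
    return True
```

With the descriptors of Table 3 the sample program is accepted. If the entry for pp 4 is changed to declare v as Bool, the Jump at pp 2 is rejected because the inferred type of v is not lower than or equal to Bool at that point.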



3.4 Status 

The complexity of the verification process is $O(n)$ since the algorithm verifies
the instructions sequentially. Furthermore, the memory usage is reduced because
the algorithm only needs the variable types of the current pp. If the type
information and the applet have been modified, the verifier always refuses the
code.

In our architecture, the card security relies mainly on the verification process.
Therefore, the design of the type-checker must be free of errors. In a recent
paper [1], we specified a formal model of a Java Card bytecode verifier and
we provided the proof of its correct implementation. As the Java Card and the
FACADE data flow verifications are close, we expect to achieve the formal proof
of the FACADE verifier in the near future. This work is part of a Ph.D. program,
and the results will be presented in a future paper.

4 Conclusions and Future Works 

The FACADE project is an academic research effort supported by the Gemplus 
Research Lab. It investigates the issue of software production in the field of 
smart cards. We have introduced the specific domain of smart cards. We have 
reviewed the programming techniques used for the production of traditional
cards and open smart cards. Then, we have presented the FACADE framework
for producing smart card programs.

This framework is based on a typed intermediate language dedicated to smart
cards. The first advantage of this single intermediate format is that it enables
migration paths between open and traditional cards, from a variety of source
languages towards a variety of smart card runtime environments and hardware
platforms. There are also other advantages in using a typed intermediate
language. First, a rigorous type system can be used to verify the safety of a
program. Second, programmers are no longer tied to a single source language
if programs written in different surface languages share the same runtime system
based on a uniform type system. Finally, type-safe languages have been shown to
support fully optimizing code generation and can efficiently implement security
extensions (access control lists or capabilities).

This paper focused on the type-safety verification process. We have shown
that it is split into two parts. The off-card part is resource consuming, but it
generates important information (the labels) in order to minimize the second
part of the verification process. By this means, the on-card part is able to perform
the verification with the limited resources of a smart card. For the moment, we
are working on the complete B model of the verifier. The expected result is a
proof of the correctness of the verification process.

Future work consists of experimental research on the execution environment of
FACADE programs. It includes the generation of target code from the FACADE
language and the implementation of the runtime system libraries. This
work will give us metrics on code size and efficiency, but also an implementation
of the operational semantics. In order to also prove the correctness of the
execution process, we intend to develop a formal model of the dynamic semantics.

Abstract. We present an automatic approach to verify designs of real-time
distributed systems for complex timing requirements. We focus our
analysis on designs which adhere to the hypotheses of the analytical
theory for Fixed-Priority scheduling. Unlike previous formal approaches, we
draw from that theory and build small formal models (based on Timed
Automata) to be analyzed by means of model checking tools. We are thus
integrating scheduling analysis into the framework of automatic formal
verification.



1 Introduction 

During the construction of real-time critical systems, it is vital to ensure that
timing requirements are satisfied by the system under development. Early
analysis methods are essential because relying only on final-product testing is risky
and costly. Indeed, the cost of repairing bad design decisions is usually very
high when they are discovered during the final stages of the software process.
Manual reasoning is error prone, and simulations are useful but do not provide
enough confidence. Thus, a rigorous verification of timing requirements for
proposed detailed designs is a key aspect of the development process of this kind
of system.

A typical example of a hard real-time requirement is schedulability, that is,
whether the system’s tasks can meet their defined deadlines. Nevertheless, task
schedulability is not the only deterministic timing requirement that needs to be
verified. Indeed, high-level End-to-End timing requirements (i.e., requirements
established on the system’s inputs and outputs, generally involving more than
one component at design level) and other safety properties "make or break"
design correctness [18]. Some examples of these requirements are: 1) responsiveness
(bounds on response times for events); 2) timing requirements for sample
and output rates; 3) freshness constraints (bounds on the age of data used in
the system); 4) correlation constraints (the maximum time-skew between several
inputs used to produce an output); 5) availability of data or space when
asynchronous communication is performed; 6) no loss of signals; and 7) guarantee on
the consistency of distributed data.

Partially supported by KIT125 and ARTE, PIC 11-00000-01856, ANPCyT.

O. Nierstrasz, M. Lemoine (Eds.): ESEC/FSE’99, LNCS 1687, pp. 494-510, 1999.

We focus our proposal on designs which adhere to the assumptions of Fixed- 
Priority scheduling theory [23,19]. Fixed-priority scheduling is used by a large 
number of applications in the real-time field [23,2,22,27]. It has also inspired real- 
time extensions to O.S. standards like POSIX [26]. In particular, we propose an 
automatic verification technique for concrete architectures, featuring: 

— Event and time-triggered tasks distributed in a net of processing resources 
and scheduled by a preemptive fixed-priority policy (e.g., Rate Monotonic
[23]). Data and control are communicated through shared variables, queues, 
circular buffers, and signals. 

— Abstract description of tasks code (where relevant events do not necessarily 
occur at the end of them). 

— Description of known assumptions about the environment behavior. 

We aim at assessing whether complex safety properties are valid for
non-trivial designs built using the features mentioned above. We propose the basis
for an efficient automatic formal approach to complement manual reasoning and
simulations. In this way, more confidence is gained in the correctness of proposed
designs.



1.1 Prior Research 

The desire to predict temporal behavior has encouraged the development of a
useful collection of results for run time scheduling (see for example [10,23]).
Thus, schedulability of applications satisfying the studied assumptions can be
efficiently analyzed. Several tools have been developed (e.g., [24,3,21]).
However, these results and tools are mainly aimed at verifying schedulability. There
is no general support for verifying distributed architectures for high-level
timing requirements whose satisfaction depends on the complex interaction among
several components.

On the other hand, the verification of real-time system designs has been
considered an interesting application field for automatic state-space analysis. Some
operational formal notations have been adapted to address the processor-sharing
phenomena (i.e., preemption of tasks) and some tools have also been developed.
Among others, we can mention RTSL [17], which is a process algebraic approach
based on discrete time. The technique allows designers to define the scheduling
discipline, and reachability analysis can be applied to detect exceptions like a
missed deadline. We can also mention GCSR [6], which is a graphical language
based on a discrete time version of a process algebra which takes into account
fixed priorities to share resources. A further technique is VERUS [11], which is
based on discrete labeled transition systems to model applications running on a
Fixed-Priority scheduled processor. Its authors use symbolic model checking to verify
time properties and to obtain timing information.

There are other approaches [4,9] based on Constant Slope Hybrid Automata.
By using this expressive dense-time^ formalism, it is possible to model
applications which do not necessarily satisfy the assumptions of most studied scheduling
theories. The drawback is that reachability is no longer decidable (although the
authors claim to obtain responses in most cases). Finally, let us mention RTD [16],
which is a family of notations based on a dense-time high-level version of Petri
Nets (HLTPN) inspired by POSIX standards [26]. The approach is mainly aimed
at simulation, symbolic execution and bounded reachability (general reachability
is not decidable in HLTPN).

Unfortunately, dealing with the concepts of elapsed and executed time (by 
using tick resolution in discrete time formalisms, and integrators in dense time 
ones) has a negative impact on the decidability or the tractability of the verifica- 
tion problems for the mentioned formal approaches (for example see discussions 
in [9,13,25]). Also, produced models are either monolithic or are composed of 
agents whose behaviors are heavily dependent on each other, since they share 
the same processing resource. Scalability becomes a very difficult issue since the 
whole model of the system must be analyzed. 



1.2 Our Approach 

In order to improve the tractability of formal analysis, we decided to sacrifice
generality by fixing the architectural style of the applications to be designed.
We believe that there is a natural trade-off between sophistication of supported
features and the ability to efficiently verify the obtained designs. We chose to adhere
to Fixed-Priority theory assumptions. Such assumptions suit a large number
of applications in the real-time field [23,22]. This allows us to apply known
scheduling theory and automatically derive simple and non-preemptive formal
models from design notations. The generality of former approaches prevented
their authors from using this theory.

Worst-Case Completion Time (WCCT or response time) of certain code areas
is one of the results stemming from some scheduling analysis techniques. On
the other hand, bounds for the Best-Case Completion Times (BCCT) can be
derived from time estimates provided by designers. Our technique uses those
calculated WCCT and BCCT to build an abstract model of the system based
on Timed Automata (TA) [1] as kernel formalism. TA is a dense time formalism
which is simpler and more tractable than Hybrid Automata and HLTPN. TA
are supported by some well established model checking tools (e.g., [14,7]) which
have been successfully applied in application fields such as protocol and circuit
verification.

In the resulting models, neither the scheduler nor the task priorities are
explicitly represented. However, their influence on task response times is taken
into account as timing information for the TA. Task activities are then modeled
as running in dedicated processors (maximal parallelism). That is, each task is
modeled by a timed automaton whose transitions are the end of its relevant
actions. By using the expressive power of TA, we describe that transitions must
occur within the best and worst-case response times of the corresponding event
(i.e., BCCT and WCCT of associated actions). The model produced in this
way is conservative, as it might have more behaviors than the actual system.
Nevertheless, it is safe to verify "nothing-bad-happens" properties (i.e., safety
properties). In real-time systems, the most interesting properties are usually safety
properties [20].^

^ Dense model of time has advantages to represent conservative abstractions of
software, see [13]

Usually, designers want to know whether the system may behave in such a 
way that the analyzed property is violated. In our proposal, designers describe 
a timed automaton which synchronizes with those events which are relevant for 
the property (which are mainly the end of communication actions) and evolves 
to an Error location if an invalid behavior is present in the system.® TA allows 
designers to express rather complex timing requirements which involve more 
than one component. Thus, we use known tools to perform reachability analysis 
(Is Error reachable?) on the parallel composition of the system with the query 
automaton. It is important to say that given a query only a subset of compo- 
nents (TA) are needed to perform the verification process. The complexity of our 
approach mainly depends on the number of components involved by the require- 
ment, and not on the number of components of the system as seen in previous 
approaches. 

One of the major contributions of this work is the integration of two lines
of research in real-time systems verification: scheduling theory and automatic
formal verification approaches. This proposal, in some sense, enlarges the
applicability of scheduling theory to deal with requirements which involve interaction
among components. We also show how known scheduling results can be used
to reduce the complexity of the models and consequently of their analysis. To
the best of our knowledge, this is the first proposal to model preemptive
systems using a dense time formalism which does not support time accumulation
constraints, a feature which is supported by Hybrid Automata or HLTPN.



1.3 Structure of the Paper 

This paper is structured as follows. First, a working example is presented in
Sect. 2. The example will be used to illustrate the method through the rest of the
sections. In Sect. 3, the basic notions of our kernel formalism, Timed Automata
(Sect. 3.1), are outlined and a WCCT calculus is presented (Sect. 3.2). Sect. 4
presents the framework to explain the modeling and verification method that
is finally outlined in Sect. 5. Finally, concluding remarks and future work are
drawn in the last section.

® Conservative modeling or analysis of systems is a well-known and useful technique 
to reduce complexity of verification process while preserving its correctness (e.g., 
[12,15,4], etc.). 

® The technique has also been used in untimed systems, as in [12].







V.A. Braberman and M. Felder 



2 An Example: A Simple Control System 

To illustrate the main ideas - and some of the features of our proposal - we
present a working example consisting of the design of a control system for a
simple industrial plant.

In the system, a controlling/monitoring software runs on two processors
connected through a net channel that transmits signals from Processor 1 to
Processor 2 (see Fig. 1). In Processor 1, two periodic tasks sample data from
sensors which reflect plant status (e.g., pressure and temperature). These data
values are written into a monitored shared variable called ShMem which is read
periodically by a Displayer and an Analyzer task. The Analyzer may decide to
send a signal to a remote processor where the actuator manager resides; this
decision may be taken according to data values (e.g., a change of status that
should be reported). In Processor 2 there is a sporadic task that attends signals
arriving from the net: the Automatic Valve Manager (AVM). It updates the
local information about the plant, which is stored in a shared variable called
Status, and decides which operation will be performed on the valve. In the same
processor there is a subsystem that attends the human operator commands (an
ON/OFF switch button). A sporadic task, Command Capturer (CC), receives
the press signal generated by the button. It stores the request in a queue that
is periodically served by the Command Manager (CM). That task acts on the
valve depending on the Status value (e.g., if the value is cooling it does not close
the valve).

3 Background 

3.1 The Kernel Language: Timed Automata 

As shown later, we use Timed Automata (TA) as the kernel formalism to
model and analyze temporal behaviors. What follows is a brief outline of the
basic notions of TA (for a more thorough presentation, see [1,14]). TA are finite
automata where time is incorporated by means of clocks. Like finite automata, TA
are composed of a finite set of states (called locations in TA literature) and a set
of labeled transitions (changes of location). There is no notion of final locations
since executions are infinite. Transitions model event occurrences while clocks
serve to measure the time elapsed since these occurrences. A set of clocks (noted
between braces {}) is associated to each transition to indicate which ones are
reset when the transition occurs. A temporal condition (TC) is associated to a
transition. A TC is a boolean combination of atoms of the form $x \sim c$, where $x$
is a clock, $\sim \in \{<, >, =, \leq, \geq\}$ and $c \in \mathbb{Z}$. A transition may occur only when its TC
is true; if so, it is executed instantaneously and the associated clocks are reset. Time
elapses at locations, and transitions are instantaneous. Also, TCs are associated
to each location - called invariants and written inside the location - and determine
the valid clock values for locations. Hence, it is possible to use an invariant to
express that the control cannot remain in the location any longer (a deadline).
The language of an automaton is the set of timed words (infinite sequences of
transitions time-stamped over $\mathbb{R}$) accepted by the timed automaton and such
that time diverges (i.e., the sequence of time stamps diverges).

Fig. 1. The System Architecture

Given a pair of TA, A and B, with disjoint sets of clocks, the parallel composition
(A || B) is built by means of the Cartesian product of their locations and the
synchronization of transitions by common labels. The invariant of a compound
location is the conjunction of the invariants of the components. The TC of a
synchronized transition is the conjunction of the local conditions while the set
of clocks to be reset is the union of the local sets.
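
As an illustration only, the composition rule can be sketched in a few lines of Python. The dictionary layout, the function name `compose` and the two toy automata are assumptions made for this sketch; constraints are kept as plain strings, a list of constraints is read as a conjunction, and transitions whose label is not shared are interleaved, which is the usual convention but is assumed here rather than stated above.

```python
from itertools import product


def compose(a, b):
    """Parallel composition of two timed automata given as dicts.

    Keys: "locations" (set), "labels" (set), "inv" (location -> list of
    constraints, read as a conjunction) and "trans" (list of
    (source, label, guard, resets, target), with guard a list of constraints
    and resets a set of clock names). Clock sets are assumed disjoint.
    """
    shared = a["labels"] & b["labels"]
    locations = set(product(a["locations"], b["locations"]))
    # Invariant of a compound location: conjunction of both invariants.
    inv = {(p, q): a["inv"].get(p, []) + b["inv"].get(q, [])
           for (p, q) in locations}
    trans = []
    for (p, label, g, r, p2) in a["trans"]:
        if label in shared:
            # Synchronize: conjoin the guards, take the union of the resets.
            for (q, label_b, g_b, r_b, q2) in b["trans"]:
                if label_b == label:
                    trans.append(((p, q), label, g + g_b, r | r_b, (p2, q2)))
        else:
            # Assumed interleaving: A moves alone, B stays where it is.
            for q in b["locations"]:
                trans.append(((p, q), label, g, r, (p2, q)))
    for (q, label, g, r, q2) in b["trans"]:
        if label not in shared:
            for p in a["locations"]:
                trans.append(((p, q), label, g, r, (p, q2)))
    return {"locations": locations, "labels": a["labels"] | b["labels"],
            "inv": inv, "trans": trans}


# Two hypothetical automata that synchronize on the label "sync".
A = {"locations": {"a0", "a1"}, "labels": {"go", "sync"},
     "inv": {"a1": ["x <= 5"]},
     "trans": [("a0", "sync", ["x >= 2"], {"x"}, "a1"),
               ("a1", "go", [], set(), "a0")]}
B = {"locations": {"b0", "b1"}, "labels": {"sync"},
     "inv": {"b0": ["y <= 3"]},
     "trans": [("b0", "sync", ["y >= 1"], {"y"}, "b1")]}
AB = compose(A, B)
```

The sketch builds the full product eagerly; a practical tool would explore only the compound locations reachable from the initial one.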

Although the problem of reachability of TA (that is, whether a location is
reachable from the initial location) is PSPACE-hard, several successful tools
have been developed to solve reachability and other verification problems (e.g., [14,7]).



3.2 Fixed-Priority Application Model and Worst-Case Completion 
Time Calculus 

The Fixed-Priority application model we base our notation on assumes that there
is a fixed set of tasks. There are two kinds of tasks: periodic and sporadic tasks.
Sporadic tasks respond to signals which have a minimum separation assumption
through a sporadic server scheme [23]. Sporadic servers can be treated as
periodic tasks for the analysis. In our applications, the execution capacity of the
sporadic server is the worst-case execution requirement to treat a signal, while
the replenish time (which plays the role of the period in the analysis) is the
minimum separation time. In other words, we suppose that just one invocation
can be released with the provided tickets (a concept of sporadic servers [23]).

Tasks are scheduled under a Fixed-Priority policy. A base priority is assigned
to them. A task executing at a given priority level can be preempted by any task
of higher priority. Tasks do not suspend themselves while executing. We assume
that every job completes its execution within its period. The time required to
perform scheduling, context switching and other overheads is ignored.
Furthermore, to simplify the presentation of this notation, there are no explicit features
to describe software mode change, jitter on clocks, phase offsets, or subtasks
with different priorities.

To ensure mutually exclusive access to data, we assume an emulation of the
Priority Ceiling Protocol (PCP) [23]. The PCP emulation scheme requires the critical section
(monitor server code) to execute at a level slightly higher than the priority of
any client task that may try to perform an operation on the shared resource
[19]. PCP guarantees that at most one critical section can delay the completion
of any given task.

WCCT calculus aims at assessing with precision the schedulability of a set of
periodic tasks running under Fixed-Priority scheduling [23,2]. Roughly speaking,
the WCCT (or response time) for a task $\tau_i$ is the least fixed point of an equation
that takes into account the worst-case interference of tasks of higher or equal
priority plus the blocking time due to lower priority tasks accessing mutual
exclusion areas:

$$R_i = \sum_{j \in hp(i)} \left\lceil \frac{R_i}{T_j} \right\rceil C_j + C_i + B_i$$

where $R_i$, $C_i$, $T_i$ and $B_i$ are the response time, execution time (a parameter
provided by the designer), period and blocking time, respectively, for task $\tau_i$. The
blocking time is the maximum of the execution times of all critical sections
with equal or higher priority than $\tau_i$ that may be invoked by tasks with lower
priority than $\tau_i$. $hp(i)$ is the set of tasks with priority equal to or greater than that of $\tau_i$.
In the schedulability analysis, designers compare calculated values against task
deadlines. Actually, the WCCT calculus we use is more sophisticated and is
based on [19]. The complexity of that calculus is $O(m n^2 L / (1 - U))$, where $n$ is the
number of tasks, $m$ the number of deadlines to be analyzed, $L$ is the ratio of the
longest task period to the shortest task period, and $U$ is the task set utilization
(i.e., $\sum_{i=1}^{n} C_i / T_i$).
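
The fixed-point computation can be illustrated with a short sketch. The Python below iterates the basic recurrence, response time = execution time + blocking time + the sum over interfering tasks j of ceil(R / T_j) * C_j, starting from the execution time plus the blocking time, until the value stabilizes. It only illustrates this simple equation, not the more sophisticated calculus of [19]; the function name, the `limit` cut-off and the three-task example are assumptions made for this sketch.

```python
def response_time(i, C, T, B, hp, limit=10**6):
    """Least fixed point of R = C[i] + B[i] + sum(ceil(R / T[j]) * C[j] for j in hp).

    C, T and B hold execution time, period and blocking time per task
    (integers in a common time unit); hp lists the interfering tasks.
    Returns None if the iteration exceeds `limit` (an assumed cut-off).
    """
    r = C[i] + B[i]
    while r <= limit:
        # -(-r // T[j]) is the integer ceiling of r / T[j].
        nxt = C[i] + B[i] + sum(-(-r // T[j]) * C[j] for j in hp)
        if nxt == r:
            return r
        r = nxt
    return None


# Hypothetical task set, listed from highest to lowest priority.
C = [1, 2, 3]    # execution times
T = [4, 6, 12]   # periods
B = [0, 0, 0]    # blocking times
R = [response_time(i, C, T, B, hp=range(i)) for i in range(3)]
U = sum(c / t for c, t in zip(C, T))   # task set utilization
```

For this task set the sketch yields R = [1, 3, 10], so every response time falls within the corresponding period. Starting the iteration from below and stopping at the first repeated value is what makes the result the least fixed point.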

4 Design Notation 

When developing a real-time system, designers usually deal with four different
descriptions: 1) the system’s architecture; 2) the internal dynamics of its
components; 3) the context assumptions; and 4) the queries or properties the system
must satisfy. In the next sections, we present examples of such notations while
we point out some of the essential issues which constrain their style and features.

4.1 Describing the System Architecture

The main criterion for defining the user language was the ease of analyzing
the resulting designs. Our technique is based on the use of known analytical
scheduling theory for Fixed-Priority scheduling [23] to build a formal model.
Thus, the theory assumptions needed to calculate WCCT and the expressive power of the
selected target formalism constrain the language features (e.g., there is no dynamic
task creation, tasks are not migratable, tasks cannot suspend themselves, etc.).
Although we believe that some limitations can be overcome, there is a trade-off
between sophistication of supported features and the ability to efficiently predict
the behavior of the resulting designs.

We adapted elements of the periodic application model given in the previous
section to express execution architectures. The designs may be composed of
periodic and sporadic tasks distributed in a network of processors (see Fig. 1). The
language supports shared variables, ports (with WriteIn, ReadFrom operations)
and bounded queues (with AddTo and ExtractFrom operations) to achieve
asynchronous message passing. Tasks are blocked only to achieve mutual exclusion,
and an error is returned if the action cannot be performed (for example, writing
to a queue that is full).^ Signals (with SendSignal operation) and timers (with
SetTimer operation) are also available for internal communication.

Processing nodes communicate through a signal net in an asynchronous
fashion (a sporadic server may attend the signal). The bounds for the transmission
times are specified; in our example the latency is between 2 and 3 ms. The
integration of the plant with the controller can be expressed through ports sampled by
periodic tasks or by means of a signal mechanism attended by sporadic servers.

We apply a simple notation based on [5] to define a net of processors, and 
the distribution and connection of the components (see Fig. 1). 



4.2 Task Internal Description 

Task code is specified at a level of abstraction where only the communication
patterns and the estimates on the execution times of actions are described;
neither data nor functionality are specified. In [8] we introduce a structured
language to describe the external/internal actions performed by tasks. Those
terms are translated into TA to perform the analysis. In this presentation,
communication patterns are described directly using TA with a tree-shape topology
(see Fig. 2). Such TA are then post-processed - as explained later in Sect. 5 - to
obtain the complete behavioral model. Locations stand for sequences of internal
and external actions. Internal actions are denoted only by their duration
intervals: [l,u] where l and u are the lower and upper bounds for the execution time
(denoted just [l] when l = u). Note that external actions are actually performed
by the invoked monitor server.

* Most finite and bounded processing-blocking abstractions could be easily
incorporated into our proposal (e.g., circular buffers). Time-out communication operations
might be incorporated as well since WCCT can be calculated.

A transition stands for the event of finalizing the associated sequence of
actions of its source location. Transitions are labeled by using a convention that
denotes the task performing the action, the instance of the communication medium
modified (if either the object or the task is unique for the action according to
the design, that information is omitted) and the success or failure of the action.
Examples are: ShMemWrittenBySampler1, QueueRead_OK, etc. Note also that
we can model data-dependent internal decisions using non-labeled transitions.
In other words, the designer can replace data decisions with non-determinism.

To express the Best-Case Completion Times of the actions, we use a clock,
for example x, which is reset at the incoming transition of the location and
tested at every outgoing transition: the clock must be greater than or equal to the lower
bound estimated by the designer (the minimum execution time to perform the
associated sequence of actions with no interference).



Fig. 2. The Task Descriptions (panels: SAMPLER1, ANALYZER, AUTOMATIC VALVE MANAGER, COMMAND MANAGER)



4.3 Describing the Context Assumptions 

Assumptions about the behavior of the environment in which the software is embedded
may also be modeled by means of TA. Although assumptions could be given in
terms of a more user-friendly language, for the sake of presentation simplicity
and expressive power, we decided to use TA.

Designers can also specify minimal separations between events, the maximum
number of events in a period and changes of mode. °

In our example, we suppose that the operator can switch the button with a 
minimum separation of 15 ms, but in a period of 60 ms he/she cannot perform 
more than two switches (see Fig. 3). 
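
To make the two assumptions concrete, the following Python sketch checks a finite list of button-press time stamps (in ms) against them. It is not part of the method, which expresses the assumptions as a timed automaton; the function name and parameters are hypothetical, and the second rule is read as a sliding window, i.e., any press must occur at least 60 ms after the press before the previous one.

```python
def satisfies_press_assumptions(times, min_sep=15, window=60, max_in_window=2):
    """Check press time stamps (ms) against the environment assumptions.

    Rule 1: consecutive presses are at least `min_sep` apart.
    Rule 2 (assumed sliding-window reading): no more than `max_in_window`
    presses fall within any interval shorter than `window`.
    """
    times = sorted(times)
    for earlier, later in zip(times, times[1:]):
        if later - earlier < min_sep:
            return False
    # The press that follows `max_in_window` earlier presses must be at
    # least `window` later than the first of them.
    for k in range(len(times) - max_in_window):
        if times[k + max_in_window] - times[k] < window:
            return False
    return True


ok_trace = satisfies_press_assumptions([0, 15, 60, 75])
too_close = satisfies_press_assumptions([0, 10])
too_many = satisfies_press_assumptions([0, 20, 59])
```

Here the first trace is accepted, the second violates the minimum separation, and the third has three presses within less than 60 ms.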



Fig. 3. The Environment Assumptions (environment automaton for the Press event)



4.4 Querying the System Behavior 

In our proposal, designers ask whether the system may behave in such a way that
the analyzed property is violated. The designer can describe a timed
automaton which synchronizes with those events relevant for the property. The timed
automaton will evolve to an Error location if the behavior of the system entails
a potential invalidating scenario. We call those TA query automata. The labels
which are available to write a query are any event appearing in the descriptions
of task behavior plus labels that stand for the completion and release of tasks
(e.g., AnalyzerCompleted, etc.), and the number of elements in queues (e.g.,
empty, 1elem, 2elem, etc.).

A query automaton should not constrain the behavior of the system. If a is a
label mentioned somewhere in the query and l is a location, then there must be
at least one transition labeled a leaving l that can occur at any time. To ensure
this property, a loop transition is automatically added at a location of a query
for each label/condition omitted by the designer at that particular location. For
the sake of readability, those transitions are not shown in the figures.
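
A minimal sketch of this completion step is given below. It is a simplification under stated assumptions: transitions are untimed triples (source, label, target), so only omitted labels are handled and not omitted clock conditions, and the function name and the example query are hypothetical.

```python
def complete_with_self_loops(locations, transitions):
    """Add a self-loop for every (location, label) pair the designer omitted.

    `transitions` is a list of (source, label, target) triples. Guards are
    ignored in this sketch, so it covers omitted labels only.
    """
    labels = {label for (_, label, _) in transitions}
    present = {(src, label) for (src, label, _) in transitions}
    loops = [(loc, label, loc)
             for loc in sorted(locations)
             for label in sorted(labels)
             if (loc, label) not in present]
    return transitions + loops


# Hypothetical query skeleton using label names in the style of the example.
locations = {"Wait", "Timing", "Error"}
query = [("Wait", "SignalSent_OK", "Timing"),
         ("Timing", "ValvePortWrittenByAVM", "Wait")]
completed = complete_with_self_loops(locations, query)
```

After completion, every location can take every label mentioned in the query, so the composition with the system never blocks a system event.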

According to our experience, automata-based notations turned out to be
simpler than most logics for describing sequences of events. By using TA, it is
easy to express complex temporal requirements for control/data flows, to detect
invalid use of non-blocking devices such as queues, etc. In the example, let us consider
the following queries:

Bounded Response 1: When a certain value is read by the Analyzer and sent to
the Automatic Valve Manager, the valve port must be written within B ms.

® Note that it is also possible to model separately physical events and the arrival 
of information into actuators and sensors. For instance, it is possible to take into 
account the time it takes a physical event to be informed by a sensor and the time 
it takes for an actuator to achieve a physical change. 







Figure 4 shows an automaton that would detect any violating behavior in 
the system. 

Bounded Response 2: When a request is issued by the operator, it must be read
by the command manager within B ms. Figure 5 shows an automaton that
accepts all the scenarios where an operator’s request is not read by the
command manager within B ms. Note that it also detects scenarios where
the notification cannot be written because the queue is full.

Freshness: Figure 6 shows an automaton that detects the scenarios where the
Displayer shows values of Sensor1 older than F ms.

Correlation: The Analyzer takes a course of action based on two data values, one
sampled by Sensor1 and the other one by Sensor2; their ages cannot differ
by more than C ms. In the correlation query, we capture scenarios where the
values used were sampled with a separation greater than C ms. Actually, the
automaton of Figure 7 is one of the six queries needed to check correlation
since it captures one of the six possible interleavings of the relevant events.
In this case, the values read by the Analyzer follow these steps: Sensor2 is
read and registered by Sampler2 and then Sensor1 is read and registered by
Sampler1.



Fig. 4. Bounded Response 1: Responsiveness to Analyzer’s signal

Fig. 5. Bounded Response 2: Responsiveness to Operator’s Commands

Fig. 6. Freshness Query: Age of Sampler1 data shown by Displayer

Fig. 7. Correlation Query: Time skew between Sampler2 and Sampler1 data



5 Formal Model 

This section sketches the way components are finally represented by TA. The
key idea of this work is to use TA to analyze a conservative abstraction of
system behavior. Following the "separation of concerns" criterion, we exclude
the fact that processing resources are shared, but keep its influence on the
temporal behavior of the system. Therefore, the system is modeled in a virtual
maximal parallelism fashion. In our models the scheduler is not represented. Each
task is modeled as running in a dedicated processor but taking into account
slowdowns which were previously calculated using the appropriate scheduling
theory (WCCT calculus). It turns out that a conservative model can be built by
just using time constraints as maximum and minimum distances between events.
No accumulation is needed to model executed time, unlike in previous dense time
approaches based on Hybrid Automata [4,9].

The tree-shape description of the control logic of a task is used as the basis to
build the timed automaton which models the behavior of the task. In the case of
a periodic task, a final location labeled Idle replaces the leaves and is connected
to the root of the tree (i.e., the first sequence of actions) (see the Command Manager
automaton of Fig. 8). For a sporadic task, instead, a final location replaces the
leaves and models the waiting for the replenish time. This location is connected
to an Idle location which, in this case, is the first location (see AVM in Fig. 8).
There is a transition from Idle to the root. In the case of a sporadic task, this
transition is labeled with the event to be attended. This event is ignored at the
rest of the locations by just looping, as the replenish time has not arrived yet.

We also need to add deadlines for the actions performed by the task. Here
WCCT helps us to build the conservative model. A clock, for instance z, is reset
at the initial transition and at the transition that links Idle to the root location
(the release instant of the task). For each location we build an invariant $z \leq U$,
where U is the WCCT of the last action of the sequence of actions represented
by the location (WCCT are measured from the release time). This means that
the control cannot remain at that location for more time than the WCCT of
the last action represented by that location.










Fig. 8. Formal Model 



Therefore, to build the formal model of the system it is necessary to
synthesize WCCT for the communication points within each task. For that
purpose, we use the results presented in [19], which extend the standard theory
to cope with tasks composed of subtasks with their own priorities. Our
tree-shaped definition of tasks satisfies the hypotheses of that work; indeed,
it is a particular case: each branch of a tree is a sequence in which internal
actions (own code running at the task priority) alternate with communication
actions (server code running at a higher priority, statically imposed by the
PCP emulation scheme [23]). Hence, a branch can be analyzed as a sequence of
subtasks. The WCCT of a subtask representing a communication action gives the
latest time at which the corresponding relevant events can occur, relative to
the instant of the last release of the task.

The net is modeled as a two-state automaton: full and empty (see Fig. 8). A
queue is modeled as an automaton with as many locations as its size (see Fig. 8). 

The complexity of the procedure described above is dominated by the complexity
of the WCCT calculus (see Sect. 3.2). Formally, the complexity of the
translation procedure is $O(t\,n^{2}L/(1 - U) + q)$, where $t$ is the maximum
number of nodes in a task description tree, $q$ is the sum of the sizes of the
queues, $n$ is the number of tasks, $L$ is the ratio of the longest task period
to the shortest task period, and $U$ is the task set utilization. In practice,
the internal task descriptions are quite simple and queues are relatively
small. Therefore $t$ and $q$ are very small numbers and $n$ becomes the
scalability parameter of the complexity term.



6 Verifying the Queries 

Property verification can be done by checking whether an Error location
is reachable in the parallel composition of the TA that model the system, the
environment, and the query automaton. This problem can be solved with any tool
that supports TA verification (e.g., [14,7]). Fortunately, it is generally not
necessary to build the parallel composition of all the TA that make up the
model of the application in order to solve a query. The sets of relevant
components (i.e., tasks, queues and nets) and relevant events needed to verify
a query can be calculated using an iterative method [8]. Using the set of
relevant events, we reduce the tree-like description of the tasks identified as
relevant so that only those events appear as transitions. We can then define a
submodel that exhibits the same behavior as the whole model, up to the
observation power of the query.

We have observed that in most examples these subsets are rather small,
which reduces the verification effort. Two reasons explain this encouraging
observation: the use of non-blocking communication media and the fact that we
do not model schedulers. We therefore generate rather "loosely coupled" models
in which only a few TA are needed to verify properties. Given this observation,
we believe that automatic verification may be useful for real-size systems when
the analyzed scenarios do not involve too many components.

Table 1 shows which components are actually needed in the parallel compo- 
sition to solve each of the given queries. It also presents size parameters of the 
resulting models (number of locations, transitions, and clocks). 



Table 1. Components needed to solve the queries

Query            Components                            Locations  Trans.  Clocks
---------------  ------------------------------------  ---------  ------  ------
Bounded 1        Sampler 1 || Analyzer || Net || AVM          86     196       6
Bounded 2        Environment || CC || Queue || CM            196     744       8
Freshness        Sampler 1 || Displayer                       30      59       5
Correlation (1)  Sampler 1 || Sampler 2 || Analyzer           84     231       7

We applied the KRONOS tool [14] (version 2.1b) on a Windows 95 platform
(Pentium 133 MHz, 32 Mbytes) to experiment with different values for the
constants B, C and F that appear in the queries. For example, we discovered in
Bounded 1 that the signal from Analyzer may be missed by the AVM due to jitter.
The results are summarized in Table 2 and seem promising.



Table 2. Results of the Queries

Query            Values   Result  Exec. Time
---------------  -------  ------  ----------
Bounded 1        B = 40   Error   2 secs
Bounded 1        B = 400  Error   20 secs
Bounded 2        B = 50   Error   235 secs
Bounded 2        B = 60   OK      91 secs
Freshness        F = 60   Error   0.80 secs
Freshness        F = 64   OK      0.90 secs
Correlation (1)  C = 60   Error   95 secs
Correlation (1)  C = 65   OK      50 secs



7 Conclusion and Future Work 

We have presented a formal approach to verify a class of real-time distributed
designs that adhere to the hypotheses of the known analytical theory for
Fixed-Priority scheduling. With this approach in mind, we built rather simple,
conservative formal models based on Timed Automata (TA) to improve the
tractability of verification. Furthermore, by choosing TA we were able to apply
their deeply studied and well-developed analysis theory, as well as their
practical tools ([14, 7], etc.). This paper reports the "foundations" required
for practical application of the approach to real-life cases. Suggested future
directions are:

- Generalizing the proposed method to cope with a broader set of run-time
  scheduling theories such as EDF [10]. We also think that these ideas can be
  used to analyze systems containing components scheduled at pre-run time.

- Extending the method to support more features, such as static and dynamic
  offsets for task release.

- Gaining accuracy by filtering the final model. Automata can be added to
  filter out some impossible temporal behaviors resulting from facts such as
  the precedence induced by task priorities with harmonic periods [18], or
  from constraints on control flow that stem from data treatment not
  considered in our abstract model.

- Providing a more user-friendly pattern language to replace the direct use
  of TA to query the design.

To a certain extent, our proposal enlarges the applicability of scheduling
theory to proving requirements that involve interaction among components. We
have also shown how known scheduling results can help reduce the complexity of
models and their analysis (by making them more compositional). Further research
on combining scheduling theory and formal analysis seems highly worthwhile.

Abstract. The liveness characteristics of a system are intimately related to the
notion of fairness. However, the task of explicitly modelling fairness constraints 
is complicated in practice. To address this issue, we propose to check LTS (La- 
belled Transition System) models under a strong fairness assumption, which can 
be relaxed with the use of action priority. The combination of the two provides a 
novel and practical way of dealing with fairness. The approach is presented in 
the context of a class of liveness properties termed progress, for which it yields 
an efficient model-checking algorithm. Progress properties cover a wide range 
of interesting properties of systems, while presenting a clear intuitive meaning 
to users. 



1 Introduction 

Our research objective is the development of practical and effective techniques for 
modelling and analysing the behaviour of concurrent systems. We aim to support 
analysis based on the software architecture of a system, and believe that the analysis 
techniques need to be both accessible to practising software engineers, and supported 
by powerful automated tools. In particular, our approach is based on the use of La- 
belled Transition Systems (LTS) to specify behaviour and Compositional Reachability 
Analysis (CRA) to check composite system models. The architecture description of a 
system drives CRA in generating the model of the system based on its components 
[14, 22, 23]. The model thus generated can be checked against the properties required 
of it. 

Previous papers have addressed the problem of verifying safety and liveness prop- 
erties in the context of CRA [6, 7]. Our work on liveness property checking [6] takes 
the automata-theoretic approach to verification [30], adopted in a number of existing 
methods and tools [1, 15, 16]. The approach is based on the use of Linear Temporal 
Logic (LTL) formulas or Büchi automata to represent liveness properties. The LTS of
a program is converted into a Büchi automaton and the LTL formula for some
property F is translated into the Büchi automaton for $\neg F$. The automaton
corresponding to the intersection of the system and the automaton obtained for
$\neg F$ is then constructed. If the resulting automaton is empty then the
property F is not violated.

The tractability of the method is significantly affected by the fact that the
Büchi automaton B is composed with the system. The size of the system can
thereby increase by a factor of m in the worst case, where m is the size of B.
Moreover, the size of a Büchi automaton may grow exponentially with the length
of the LTL formula that it represents [16]. Although efficient algorithms exist
for the automatic translation of LTL formulas into Büchi automata [12], none of
them is guaranteed to generate the minimal automaton. In such a setting,
fairness is usually represented by constraints in the form of Büchi automata,
which are also composed with the system [1, 16]. Besides complicating the task
of modelling, this may further increase the size of the system to be analysed.

To avoid burdening the users with modelling fairness constraints explicitly, we 
propose an optional predefined fairness assumption on the executions of an LTS 
model. Under this assumption, we have found that a specific class of liveness proper- 
ties, which we have termed progress, can be checked on the unmodified LTS of the 
system. This is an advantage compared to methods that may increase the state space of 
the system by the introduction of property and fairness automata. As the fairness as- 
sumption may be too restrictive in some circumstances, we introduce an action prior- 
ity scheme that relaxes it. This combination provides a simple, practical and effective 
way of dealing with selected types of liveness, and of taking fairness into account 
when performing liveness property checks. Most importantly, the technique is widely 
accessible since it requires little or no experience with temporal logic. 

Note that the class of liveness properties that can be expressed as progress proper- 
ties is a subset of those that can be expressed with LTL. Consequently, we do not see 
progress as supplanting the need for general LTL model checking. We simply propose 
it as a more accessible alternative to Büchi automata, whenever it covers the particular
needs of the system developer. As discussed later in the paper, our experience and that 
of others [11] indicate that a large number of interesting properties of systems can be 
expressed and checked in terms of progress properties. 

The LTSA tool. The results of our work have been incorporated in an analysis tool - 
the Labelled Transition System Analyser (LTSA) [22, 23]. The examples used in the 
paper to illustrate progress checking were developed using the LTSA tool. We will 
briefly present how models of system behaviour are described for the LTSA. The tool 
uses a simple process algebra notation, called FSP for Finite State Processes, to define 
the behaviour of processes. As an aid to understanding, the LTSA supports the facility 
of drawing the LTS corresponding to an FSP specification. 

Fig. 1 gives the FSP specification and corresponding LTS of a server that may be
accessed by two clients, A and B. The server may receive requests from either
client A or client B (actions a.req and b.req, respectively). After receiving a
request, the server processes it and produces a corresponding reply (actions
a.reply and b.reply). The behaviour of SERVER is defined using action prefix
("->"), choice ("|") and recursion. In the interests of brevity, we will not
formally define their semantics here; the meaning in the example should be clear
from the associated LTS diagram.

Structure. The next section describes how progress properties are specified, and how
they are checked under the proposed fairness assumption. Section 3 introduces the
concept of action priority and its use in progress analysis. Section 4 presents the
Readers/Writers example that is used to illustrate and evaluate our approach. Finally,
Section 5 discusses related work, and Section 6 closes the paper with conclusions and
plans for future work.

FSP: SERVER = ( a.req -> a.reply -> SERVER
              | b.req -> b.reply -> SERVER ).

Fig. 1. FSP specification and LTS for process SERVER

2 Progress Properties and the Need for Fairness 

The regular occurrence of some actions in a system execution indicates that system
behaviour progresses as desired or expected. We would therefore like to be able to
check on the model of a system that, in all possible executions of the system, such
actions occur regularly. In the context of an infinite execution, regularly means
infinitely often. A property that asserts that an action a is expected to occur
infinitely often in every infinite execution of the system is expressed in LTL as
$\Box\Diamond a$. We call properties of this type progress. Often, progress is not
determined by a single action but by one of a set of alternatives. For example, a
system may be considered to make progress if it outputs one of a set of values.
Consequently, we define progress properties in terms of a finite set of actions as
follows:

    progress P = {a1, a2 .. an} defines a progress property P
    which asserts that in any infinite execution of a target system, at
    least one of the actions a1, a2 .. an occurs infinitely often.

The LTL formulation of the progress property P is $\Box\Diamond(a_1 \lor \dots \lor a_n)$.
Consider a very simple system S that consists of the server modelled in Fig. 1, and
two clients A and B accessing it. The system can be expressed as the parallel
composition of the two clients and the server, as illustrated in Fig. 2. Processes
assembled with the || parallel composition operator run concurrently, synchronising
on actions that are common to their alphabets and interleaving the remaining
actions. The LTS for system S is identical to the LTS of Fig. 1.

A = (a.req -> a.reply -> A).

B = (b.req -> b.reply -> B).

||S = (A || B || SERVER).

Fig. 2. FSP specification of a system with two clients accessing a server 
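
The composition rule described above (synchronise on shared actions, interleave the rest) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary-based LTS encoding and the helper names `lts`, `moves` and `compose` are hypothetical and are not part of FSP or the LTSA tool. Composing the two clients with the server yields three states and four transitions, consistent with the statement that the LTS of S is identical to that of Fig. 1.

```python
from collections import deque

def lts(init, transitions):
    """An LTS as a dict: initial state, alphabet, (source, action, target) triples."""
    return {"init": init,
            "alphabet": {a for _s, a, _t in transitions},
            "trans": set(transitions)}

def moves(p, state):
    """Enabled (action, target) pairs of LTS p at the given state."""
    return [(a, t) for s, a, t in p["trans"] if s == state]

def compose(p, q):
    """p || q: synchronise on shared actions, interleave the others."""
    shared = p["alphabet"] & q["alphabet"]
    init = (p["init"], q["init"])
    trans, seen, todo = set(), {init}, deque([init])
    while todo:
        sp, sq = todo.popleft()
        succ = []
        for a, tp in moves(p, sp):
            if a in shared:   # both sides must take the action together
                succ += [(a, (tp, tq)) for b, tq in moves(q, sq) if b == a]
            else:             # p moves alone
                succ.append((a, (tp, sq)))
        for a, tq in moves(q, sq):
            if a not in shared:   # q moves alone
                succ.append((a, (sp, tq)))
        for a, nxt in succ:
            trans.add(((sp, sq), a, nxt))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return {"init": init, "alphabet": p["alphabet"] | q["alphabet"], "trans": trans}

A = lts(0, [(0, "a.req", 1), (1, "a.reply", 0)])
B = lts(0, [(0, "b.req", 1), (1, "b.reply", 0)])
SERVER = lts(0, [(0, "a.req", 1), (1, "a.reply", 0),
                 (0, "b.req", 2), (2, "b.reply", 0)])

S = compose(compose(A, B), SERVER)
states = {s for s, _a, _t in S["trans"]} | {t for _s, _a, t in S["trans"]}
print(len(states), "states,", len(S["trans"]), "transitions")
```

Only reachable states are generated, which is why the client states that would require the server to serve both clients at once never appear.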

For system S it is likely that a designer would expect both progress properties 
SERVE_A and SERVE_B to hold, where: 

progress SERVE_A = {a.reply}
progress SERVE_B = {b.reply}

The reason is that an execution where the requests of some client are ignored indefi- 
nitely is clearly undesirable. 

These properties do not hold for S (its LTS is identical to that of Fig. 1). For example,
SERVE_B is violated because S can generate an infinite execution that only listens
to the requests of client A by always choosing the transition leading to state (1) when
at state (0). This violation corresponds to a scheduler that is consistently biased against 
a specific enabled transition when given a choice. However, any reasonable scheduler 
should implement some notion of fairness when choosing between sets of possible 
transitions. As it is not possible to express fairness explicitly in the standard LTS 
model, we make the following fairness assumption in order to check progress: 

Fair Choice: If a choice over a set of transitions is executed infinitely 
often, then every transition in the set is executed infinitely often. 

As discussed in section 5, fair choice corresponds to a strong fairness assumption on 
the system transitions. Under fair choice, progress properties SERVE_A and SERVE_B 
hold for system S. Consider now the case where in system S, client B is replaced by
client B_FAULTY, which may crash, as modelled by the following FSP expression:

B_FAULTY = (b.req -> b.reply -> B_FAULTY | b.crash -> STOP).

For simplicity, we assume that B_FAULTY does not crash while waiting for a reply. 
The LTS of system S in this case is illustrated in Fig. 3. We can see that in this system, 
progress property SERVE_B is no longer satisfied: action b. reply can only occur 
finitely many times in any fair infinite execution that reaches states (1) or (2) at some 
point. The set of states {1,2} is called a terminal set of states, because its states
are mutually reachable, but no state outside the terminal set can be reached from any
of them. We will prove later that in finite state systems, any fair infinite execution
reaches a terminal set of states. As a result, the only actions that are repeated infinitely 
often in such executions are actions that label transitions between states of the termi- 
nal set. 

Fig. 3. System consisting of a server and two clients, one of which may crash

The LTSA tool reports the violation as follows: 

Progress violation: SERVE_B
Trace to terminal set of states:
    a.req
    b.crash
Actions in terminal set:
    {a.req, a.reply}

This violation does not correspond to a real problem with the system. It is obvious that 
reply actions cannot occur infinitely often if, after some point, requests are no longer 
being issued. So the desired property is in fact that, if requests from B occur regularly,
then replies to B must also occur regularly, i.e. $\Box\Diamond b.req \Rightarrow \Box\Diamond b.reply$.
We call this form of progress property conditional progress, which we define as follows:

    progress P = if {a1, a2 .. an} then {b1, b2 .. bm}
    defines a progress property P which asserts that in any infinite
    execution of a target system, if any of the actions a1, a2 .. an
    occurs infinitely often then at least one of the actions b1, b2 .. bm
    also occurs infinitely often.

Progress property SERVE_B can therefore be restated as follows: 

progress SERVE_B = if {b.req} then {b.reply}

This property is satisfied by system S, since after B_FAULTY crashes, it stops making
requests to the server. The property therefore makes sure that, when B_FAULTY is
alive, its requests are never consistently ignored, which is what the user wishes to
check.

In the following, we formally describe and prove the checking mechanism for prog- 
ress properties for a system executing under fair choice. 

Labelled Transition Systems: 

Let States be the universal set of states, Act be the universal set of observable action
labels, and $Act_\tau = Act \cup \{\tau\}$, where $\tau$ denotes an action that is internal
to a subsystem, and therefore unobservable by its environment. An LTS of a process P is
a quadruple $\langle S, A, \Delta, q \rangle$ where:

• $S \subseteq States$ is a finite set of states,

• $A = \alpha P \cup \{\tau\}$, where $\alpha P \subseteq Act$ is the communicating alphabet of P,

• $\Delta \subseteq S \times A \times S$ is a transition relation that maps a state and an
  action onto another state,

• $q \in S$ indicates the initial state of P.

For an LTS $P = \langle S, A, \Delta, q \rangle$, we say that action $a \in A$ is enabled at a
state $s \in S$ iff $\exists s' \in S$ such that $(s, a, s') \in \Delta$. Similarly, we say that
a transition $(s, a, s') \in \Delta$ is enabled at a state $t \in S$ iff $t = s$.

We call an execution of P an infinite sequence $q_0 a_0 q_1 a_1 \dots$ of states $q_i$ and
actions $a_i$ such that $q_0 = q$ and $\forall i \ge 0$, $(q_i, a_i, q_{i+1}) \in \Delta$. A
trace of P is a sequence of observable actions that P can perform starting from its
initial state [17].

A state r is reachable from a state s in an LTS $P = \langle S, A, \Delta, q \rangle$ iff
$r = s$, or $\exists a \in A$ and $t \in S$ such that $(s, a, t) \in \Delta$ and r is
reachable from t. For a state $s \in S$, $Reachable(s, P)$ denotes the set of states that
are reachable from s in P, i.e.
$Reachable(s, P) = \{ r \in S \mid r \text{ is reachable from } s \text{ in } P \}$. An LTS
$P = \langle S, A, \Delta, q \rangle$ transits into another LTS
$P' = \langle S, A, \Delta, q' \rangle$ with an action $a \in A$ iff $(q, a, q') \in \Delta$.
That is:

• $\langle S, A, \Delta, q \rangle \xrightarrow{a} \langle S, A, \Delta, q' \rangle$ iff $(q, a, q') \in \Delta$.

Definition - A terminal set of states $C \subseteq S$ in an LTS
$P = \langle S, A, \Delta, q \rangle$ is a strongly connected component with no outgoing
transitions, i.e.

• $\forall s \in C$, $C \subseteq Reachable(s, P)$, and

• $\forall s \in C$, $Reachable(s, P) \subseteq C$. ■

It follows directly from the above definition that C is a terminal set of states in an LTS
P iff $\forall s \in C$, $Reachable(s, P) = C$.
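
The characterisation "C is terminal iff Reachable(s, P) = C for every s in C" translates directly into code. The following Python sketch is illustrative only: the function names are hypothetical, and the transition list is a hand-built LTS for the server with client A and a crashing client B, using its own state numbering rather than the numbering of Fig. 3.

```python
def reachable(s, trans):
    """Reachable(s, P): all states reachable from s, including s itself."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for x, _a, y in trans:
            if x == u and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def is_terminal(C, trans):
    """C is a terminal set iff Reachable(s, P) == C for every s in C."""
    C = set(C)
    return bool(C) and all(reachable(s, trans) == C for s in C)

# Hypothetical numbering: 0 = all idle, 1 = A waiting, 2 = B waiting,
# 3 = B crashed and A idle, 4 = B crashed and A waiting.
trans = [(0, "a.req", 1), (1, "a.reply", 0),
         (0, "b.req", 2), (2, "b.reply", 0),
         (0, "b.crash", 3), (1, "b.crash", 4),
         (3, "a.req", 4), (4, "a.reply", 3)]

print(is_terminal({3, 4}, trans))     # states after the crash
print(is_terminal({0, 1, 2}, trans))  # strongly connected, but b.crash leaves it
```

The second set is strongly connected yet not terminal, because the b.crash transitions lead outside it. This quadratic check is convenient for small models; a linear-time alternative based on strongly connected components is preferable for larger ones.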

Terminal Set Theorem - Let P = {S, A, A, q) be a finite-state process that executes 
under “fair choice”. If w is a legal infinite execution of P, then the set of states that 
appear infinitely often in w forms a terminal set of states in P. 



1 If in addition we wanted to check that a reply is received for each request, we would combine
the progress property with a safety property [7], which would ensure that a reply must occur 
in any interval defined by two requests. 
Proof: Let $S_1 \subseteq S$ be the set of states that occur infinitely often in w. Since P
consists of a finite number of states, $S_1$ is not empty. With fair choice, the fact that
states in $S_1$ are repeated infinitely often in w implies that all transitions that are
enabled at these states also occur infinitely often in w. This means that all states that
are reachable from states of $S_1$ in P occur infinitely often in w. We conclude that
$\forall s \in S_1$, $Reachable(s, P) \subseteq S_1$. It is also straightforward that since
all states in $S_1$ are repeated infinitely often in w, every state in $S_1$ is reachable
from any other state in $S_1$, and therefore $\forall s \in S_1$,
$S_1 \subseteq Reachable(s, P)$. We conclude that $\forall s \in S_1$,
$Reachable(s, P) = S_1$ and therefore $S_1$ is a terminal set of states. ■

From the Terminal Set Theorem we conclude that a fair infinite execution w is obtained
by repeating infinitely often the states in a terminal set of states. As a result, the
actions that occur infinitely often in w are exactly those actions that are enabled at
states in the terminal set. Therefore, a property "progress P = {a1, a2 .. an}" is
satisfied iff for each terminal set of states C in the LTS of the system, the following
holds: $\exists s \in C$ such that some action $a \in \{a_1, a_2, \dots, a_n\}$ is enabled at
s (we say that a is enabled in C). Similarly, a property
"progress P = if {a1, a2 .. an} then {b1, b2 .. bm}" is satisfied iff in the LTS of the
system, there is no terminal set of states where some action in {a1, a2 .. an} but no
action in {b1, b2 .. bm} is enabled.

The algorithm that decides whether a progress property is satisfied is therefore 
based on the computation of the terminal sets of states of a system. Terminal sets are 
found by computing the strongly connected components in the LTS graph and apply- 
ing the additional criterion that no transition exists to a state outside the strongly con- 
nected component. Tarjan [29] showed that strongly connected components can be 
computed in linear time. Consequently, the check that progress properties hold is effi- 
cient. Note that it is only necessary to compute the terminal sets once to check any 
number of progress properties. As diagnostic information in case of progress viola- 
tions, the LTSA tool displays a trace of actions leading to the terminal set together 
with the actions enabled in the set (see sample LTSA output above). 
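
As a rough illustration of the algorithm just described, the Python sketch below computes strongly connected components with Tarjan's algorithm, keeps those with no outgoing transition, and then tests plain and conditional progress properties against the actions enabled in each terminal set. The function names and the example LTS (a server with client A and a crashing client B, numbered by hand) are assumptions made for this sketch; it is not the LTSA implementation.

```python
def terminal_sets(states, trans):
    """Terminal sets: strongly connected components with no outgoing transition."""
    succ = {s: [] for s in states}
    for s, _a, t in trans:
        succ[s].append(t)
    index, low, stack, on_stack, sccs = {}, {}, [], set(), []

    def visit(v):  # Tarjan's algorithm, recursive form (fine for small models)
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succ[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            sccs.append(comp)

    for s in states:
        if s not in index:
            visit(s)
    return [c for c in sccs if all(t in c for s in c for t in succ[s])]

def enabled(C, trans):
    """Actions enabled at some state of C."""
    return {a for s, a, _t in trans if s in C}

def progress_violations(states, trans, then_set, if_set=None):
    """Terminal sets violating 'progress P = [if if_set then] then_set'."""
    bad = []
    for C in terminal_sets(states, trans):
        acts = enabled(C, trans)
        premise = True if if_set is None else bool(acts & set(if_set))
        if premise and not (acts & set(then_set)):
            bad.append(C)
    return bad

# Hypothetical numbering: 3 and 4 are the states after b.crash.
states = range(5)
trans = [(0, "a.req", 1), (1, "a.reply", 0),
         (0, "b.req", 2), (2, "b.reply", 0),
         (0, "b.crash", 3), (1, "b.crash", 4),
         (3, "a.req", 4), (4, "a.reply", 3)]

print(progress_violations(states, trans, {"b.reply"}))             # plain SERVE_B
print(progress_violations(states, trans, {"b.reply"}, {"b.req"}))  # conditional
print(progress_violations(states, trans, {"a.reply"}))             # SERVE_A
```

The terminal sets are computed once per call here for brevity; a tool would compute them once and reuse them for every property, as noted above.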

The LTSA performs a default progress check when no progress properties are explicitly
specified. This consists of checking progress with respect to all actions in the
alphabet of the system. For a system S, this is equivalent to checking that
$\forall a \in \alpha S$, progress $P_a = \{a\}$. If no actions in $\alpha S$ are missing
from the terminal sets of states in S, then liveness is guaranteed in the system, since
all actions always eventually occur. However, the liveness guarantee is with respect to
the assumption of fair choice. We will see in the next section that liveness problems
related to scheduling only become apparent when the system model is augmented to reflect
adverse conditions.



3 Action Priority 

The progress checking mechanism proposed in the previous section is based on the 
assumption of fair choice. This assumption corresponds to strong fairness on the sys- 
tem transitions, which is often too restrictive to be practical [27]. In fact, practical 
schedulers in computing systems do not implement fair choice [2]. This means that 
some executions that may be exhibited by the system will be ignored by the checking
mechanism as unfair. To find problems with such executions, we propose a simple
action priority scheme that allows the user to "stress" a system by applying adverse
scheduling conditions. With our scheme, a set of actions in a process is given higher or
lower priority than the remaining ones in the process alphabet. We introduce the
following abbreviations:

$P \xrightarrow{a}$ to mean that $\exists P'$ such that $P \xrightarrow{a} P'$

$P \not\xrightarrow{a}$ to mean that $\nexists P'$ such that $P \xrightarrow{a} P'$

The low (high) priority operators » («) take as arguments a process
$P = \langle S_1, A_1, \Delta_1, q_1 \rangle$ and a set of actions $K \subseteq Act$, and
return the process $P \gg K = \langle S_1, A_1, \Delta, q_1 \rangle$
($P \ll K = \langle S_1, A_1, \Delta, q_1 \rangle$), where the semantics for $\Delta$ are
given by Rule 1 (Rule 2) below:

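Rule 1: Let $a \in Act_\tau$. Then:

$$\frac{P \xrightarrow{a} P'}{P \gg K \xrightarrow{a} P' \gg K}
\quad \text{if } \big((a \notin K) \text{ or } (\forall b \in (A_1 - K),\ P \not\xrightarrow{b})\big)$$

Rule 2: Let $a \in Act_\tau$. Then:

$$\frac{P \xrightarrow{a} P'}{P \ll K \xrightarrow{a} P' \ll K}
\quad \text{if } \big((a \in K) \text{ or } (\forall b \in K,\ P \not\xrightarrow{b})\big)$$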

Intuitively, P»K expresses the fact that actions in K have lower priority than the
remaining actions in $\alpha P$. As a result, at any state where multiple actions are
eligible, actions in K are ignored unless it is not possible to execute any action in
$\alpha P - K$ instead. In contrast, in P«K, actions in K have high priority, so actions
in $\alpha P - K$ are only selected when it is not possible to execute some action in K
instead.

Action priority is thus used in our approach to force specific transitions to be taken 
when a choice is possible. Let P be the original system to be checked, and P' be the 
result of applying action priority to P. Then selected unfair executions of P will corre- 
spond to fair executions of P'. These unfair executions of P can therefore be checked 
with our mechanism by checking system P' under fair choice. 
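
The effect of the two priority operators on a transition relation can be sketched as a simple filter. The Python functions below are an illustration under assumptions: the names `low_priority` and `high_priority` are hypothetical, and transitions are plain (source, action, target) triples. They follow the reading of Rule 1 and Rule 2 given above: a transition is kept if its action belongs to the favoured group, or if no action of the favoured group is enabled at the same state.

```python
def _by_state(trans):
    """Group (action, target) pairs by source state."""
    grouped = {}
    for s, a, t in trans:
        grouped.setdefault(s, []).append((a, t))
    return grouped

def low_priority(trans, K):
    """P >> K: an action in K survives only where no action outside K is enabled."""
    K = set(K)
    kept = set()
    for s, moves in _by_state(trans).items():
        others_enabled = any(a not in K for a, _t in moves)
        for a, t in moves:
            if a not in K or not others_enabled:
                kept.add((s, a, t))
    return kept

def high_priority(trans, K):
    """P << K: an action outside K survives only where no action in K is enabled."""
    K = set(K)
    kept = set()
    for s, moves in _by_state(trans).items():
        k_enabled = any(a in K for a, _t in moves)
        for a, t in moves:
            if a in K or not k_enabled:
                kept.add((s, a, t))
    return kept

# The SERVER process of Fig. 1 as transition triples.
server = {(0, "a.req", 1), (1, "a.reply", 0),
          (0, "b.req", 2), (2, "b.reply", 0)}

print(sorted(low_priority(server, {"b.req"})))   # b.req is dropped at state 0
print(sorted(high_priority(server, {"b.req"})))  # a.req is dropped at state 0
```

Giving b.req low priority removes the choice at state 0 in favour of a.req, which turns the unfair execution that never serves B into the only execution of the prioritised system. States that become unreachable after pruning are left in place by this sketch.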



4 Example: Readers/Writers

To illustrate our approach to progress analysis using action priority, we will use the
well-known Readers/Writers problem. This is concerned with access to a shared
database by two kinds of processes. Readers execute transactions that examine the
database while Writers both examine and update the database. For the database to be
updated correctly, Writers must have exclusive access to the database while they are
updating it. If no Writer is accessing the database, any number of Readers may
concurrently access it. Access to the database is controlled by a read/write lock which
processes must acquire before accessing the database. The FSP model for such a lock,
together with the processes that acquire and release it, is defined in Fig. 4. The system
consists of the parallel composition of the user processes with the lock. The process
READWRITELOCK is defined as a choice among a set of guarded actions controlled
by the variables writing and readers. The action for a reader to acquire a lock is only
permitted when writing is false, indicating that the lock has not been acquired by a
writer. The action for a writer to acquire the lock is only permitted when the lock has
not been acquired for either read or write access (readers==0 && !writing). The LTS
generated for the composition READERS_WRITERS is depicted in Fig. 5.

const Nread = 2    // Maximum readers
range R = 1..Nread
const Nwrite = 2   // Maximum writers
range W = 1..Nwrite
range ReadR = 0..Nread
range WriteW = 0..Nwrite

READWRITELOCK = RW[0][False],
RW[readers:ReadR][writing:Bool] =
    ( when (!writing && readers<Nread)
          reader[R].acquire -> RW[readers+1][writing]
    | when (readers>0)
          reader[R].release -> RW[readers-1][writing]
    | when (readers==0 && !writing)
          writer[W].acquire -> RW[readers][True]
    | when (writing)
          writer[W].release -> RW[readers][False] ).

USER = (acquire -> release -> USER).

||READERS_WRITERS =
    (reader[R]:USER || writer[W]:USER || READWRITELOCK).

Fig. 4. Readers/Writers system model 



The progress properties of interest in this system are that writers can always acquire
the lock and that readers can always acquire the lock. These properties can be
specified as:

progress WRITER = {writer[W].acquire}

progress READER = {reader[R].acquire}

The progress property WRITER is satisfied if any writer in the range W acquires the
lock. The property READER is satisfied if any reader in the range R acquires the lock.
A progress check of these properties against the READERS_WRITERS system discovers
no violations. We now examine the behaviour of the system under adverse
conditions. For the READERS_WRITERS system, these adverse conditions occur
when there is always competition for the lock. This happens when either the lock is
requested frequently or the lock is held by processes for long periods. To model these
conditions, we give release actions for both readers and writers lower priority than
acquire actions. Consequently, in any choice between acquiring and releasing the lock,
acquiring it will have priority. This is described by:

||RW_PROGRESS = READERS_WRITERS
                >> {reader[R].release, writer[W].release}.

Progress analysis of this system results in the following violation: 

Progress violation: WRITER
Trace to terminal set of states:
    reader.1.acquire
Actions in terminal set:
    {reader.1.acquire, reader.1.release,
     reader.2.acquire, reader.2.release}

This is the writer starvation situation in which writers do not get access because the
number of readers with read access never drops to zero. In this simple example, the
terminal set of states {3,4,5} causing the violation can be seen in the LTS of
RW_PROGRESS depicted in Fig. 7.
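
The starvation scenario can be reproduced with a small, self-contained Python sketch. It is an illustration under assumptions, not the LTSA tool: the global state encoding (set of lock holders, reader count, writing flag) and all function names are hypothetical, and the guards mirror the first version of READWRITELOCK in Fig. 4. Release actions are given low priority by discarding them whenever some other action is enabled.

```python
NREAD = NWRITE = 2
USERS = ([f"reader.{i}" for i in range(1, NREAD + 1)] +
         [f"writer.{i}" for i in range(1, NWRITE + 1)])
INIT = (frozenset(), 0, False)   # (users holding the lock, readers, writing)

def moves(state):
    """Enabled (action, next state) pairs of READERS_WRITERS."""
    holding, readers, writing = state
    out = []
    for u in USERS:
        is_reader = u.startswith("reader")
        if u not in holding:                      # USER is ready to acquire
            if is_reader and not writing and readers < NREAD:
                out.append((u + ".acquire", (holding | {u}, readers + 1, writing)))
            if not is_reader and readers == 0 and not writing:
                out.append((u + ".acquire", (holding | {u}, readers, True)))
        else:                                     # USER is ready to release
            if is_reader and readers > 0:
                out.append((u + ".release", (holding - {u}, readers - 1, writing)))
            if not is_reader and writing:
                out.append((u + ".release", (holding - {u}, readers, False)))
    return out

def prioritised(state):
    """RW_PROGRESS: release actions are taken only if nothing else is enabled."""
    ms = moves(state)
    preferred = [m for m in ms if not m[0].endswith(".release")]
    return preferred or ms

def explore(init, step):
    """Reachable states and transitions under the given step function."""
    seen, todo, trans = {init}, [init], []
    while todo:
        s = todo.pop()
        for a, t in step(s):
            trans.append((s, a, t))
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen, trans

def reach(s, trans):
    seen, todo = {s}, [s]
    while todo:
        u = todo.pop()
        for x, _a, y in trans:
            if x == u and y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def terminal_action_sets(init, step):
    """Actions enabled in each terminal set (Reachable(s) == C for all s in C)."""
    states, trans = explore(init, step)
    found, result = [], []
    for s in states:
        r = reach(s, trans)
        if r not in found and all(reach(t, trans) == r for t in r):
            found.append(r)
            result.append({a for x, a, _t in trans if x in r})
    return result

fair = terminal_action_sets(INIT, moves)
stressed = terminal_action_sets(INIT, prioritised)
print(sorted(stressed[0]))
```

Without priority there is a single terminal set containing every action, so both WRITER and READER hold. With low priority on release, the only terminal set contains reader acquire and release actions only, which matches the reported violation of WRITER.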



Fig. 5. LTS for READERS_WRITERS

The problem of writer starvation can be fixed by making readers defer to waiting 
writers. To detect waiting processes, we modify the definition of USER processes such 
that they request access to the lock before attempting to acquire it: 

USER = (request -> acquire -> release -> USER).

The revised definition of the lock that uses this information is listed in Fig. 6. The new
version keeps a count of waiting writers, ww. Readers only acquire access if there are
no writers waiting (!writing && readers<Nread && ww==0). When this new version of the
lock is checked under the same conditions, no violation of the progress property
WRITER is detected. However, it is now possible for readers to starve:

Progress violation: READER
Trace to terminal set of states:
    reader.1.request
    reader.2.request
    writer.1.request
    writer.2.request
Actions in terminal set:
    {writer.1.request, writer.1.acquire,
     writer.1.release, writer.2.request,
     writer.2.acquire, writer.2.release}



READWRITELOCK = RW[0][False][0],
RW[readers:ReadR][writing:Bool][ww:WriteW] =
    ( when (!writing && readers<Nread && ww==0)
          reader[R].acquire -> RW[readers+1][writing][ww]
    | when (readers>0)
          reader[R].release -> RW[readers-1][writing][ww]
    | when (readers==0 && !writing && ww>0)
          writer[W].acquire -> RW[readers][True][ww-1]
    | when (writing)
          writer[W].release -> RW[readers][False][ww]
    | when (ww<Nwrite)
          writer[W].request -> RW[readers][writing][ww+1]
    | reader[R].request -> RW[readers][writing][ww]
    ).



Fig. 6. Revised READWRITELOCK 
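
To make the guards of Fig. 6 easier to experiment with, the lock can be transcribed into a small Python function. This is a hedged sketch: the function name `lock_moves`, the unindexed action names and the bounds Nread = Nwrite = 2 are assumptions made for illustration, and the function tracks only the lock state, not which individual user is idle, waiting or holding the lock.

```python
NREAD, NWRITE = 2, 2  # assumed bounds on readers and waiting writers


def lock_moves(readers, writing, ww):
    """Actions offered by RW[readers][writing][ww], mapped to the next lock state."""
    moves = {}
    if not writing and readers < NREAD and ww == 0:
        moves["reader.acquire"] = (readers + 1, writing, ww)
    if readers > 0:
        moves["reader.release"] = (readers - 1, writing, ww)
    if readers == 0 and not writing and ww > 0:
        moves["writer.acquire"] = (readers, True, ww - 1)
    if writing:
        moves["writer.release"] = (readers, False, ww)
    if ww < NWRITE:
        moves["writer.request"] = (readers, writing, ww + 1)
    moves["reader.request"] = (readers, writing, ww)  # always accepted
    return moves


# Replay a writer-only cycle starting with two writers waiting: the count ww
# stays above zero at every point where the lock is free, so the guard on
# reader.acquire is never satisfied.
state = (0, False, 2)
for _ in range(3):
    for action in ("writer.acquire", "writer.release", "writer.request"):
        assert "reader.acquire" not in lock_moves(*state)
        state = lock_moves(*state)[action]
print("final lock state:", state)
```

The replayed cycle corresponds to the reader starvation reported above: as long as some writer re-requests the lock before it is free with no writers waiting, readers are never offered the acquire action.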



The problem of reader starvation can of course be fixed by introducing a “turn” vari- 
able that lets readers and writers run alternately when competition exists for the lock. 
Such a system should satisfy both the READER and WRITER progress properties. 
Examples of conditional progress properties related to the READERS_WRITERS 
system are shown below: 

progress WREL[i:W] =
    if {writer[i].acquire} then {writer[i].release}
progress RREL[i:R] =
    if {reader[i].acquire} then {reader[i].release}

The progress properties assert for each writer and for each reader that, if they regu- 
larly acquire the lock, they must also regularly release it. None of these properties is 
violated by the two versions of the system presented. 
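
Under fair choice, every action of a terminal set occurs infinitely often once the set is entered, so a conditional progress property can be checked against the set of actions of each terminal set. The following Python sketch illustrates this reading; the function name `progress_ok` and the representation of a terminal set by its action set are assumptions made for this example.

```python
def progress_ok(terminal_actions, then, condition=None):
    """Check 'progress P = if condition then then' on one terminal set.

    With condition=None this is an unconditional progress property.
    """
    if condition is not None and not (terminal_actions & condition):
        return True  # the condition actions do not occur infinitely often
    return bool(terminal_actions & then)


# Terminal set reported for the WRITER violation of RW_PROGRESS.
starving_writers = {
    "reader.1.acquire", "reader.1.release",
    "reader.2.acquire", "reader.2.release",
}

wrel_1 = progress_ok(starving_writers,
                     then={"writer.1.release"},
                     condition={"writer.1.acquire"})
rrel_1 = progress_ok(starving_writers,
                     then={"reader.1.release"},
                     condition={"reader.1.acquire"})
writer = progress_ok(starving_writers,
                     then={"writer.1.acquire", "writer.2.acquire"})
print(wrel_1, rrel_1, writer)
```

On this terminal set WREL[1] holds vacuously, because writer 1 never acquires the lock there, RREL[1] holds because reader 1 both acquires and releases, and the unconditional property WRITER fails.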

The checking mechanism that we have proposed has a number of advantages when 
compared to the approach based on Büchi automata. In the READERS_WRITERS 
example, each of the progress properties has to be checked separately if Büchi 
automata are to be used for verification. The Büchi automaton for the negation of property 
WRITER (◇□¬(writer.1.acquire ∨ writer.2.acquire)) is illustrated in Fig. 8. The 
transition @WRITER is used to mark the accepting state (1) of the automaton [6]. Note 
that when fair choice is assumed, a complete automaton must be used for verification. 
This is necessary because, when a transition is undefined in the automaton, a non-terminal 
set of states may become terminal. A Büchi automaton can always be made complete 
by adding one state. 

Fig. 7. LTS for RW_PROGRESS 

The system READERS_WRITERS || WRITER consists of 18 states. The size of the 
system has therefore increased by a factor of 3, which corresponds to the size of the 
Büchi automaton. For large systems, such an increase is significant. Additionally, in our 
approach, a single graph exploration is sufficient to check any number of progress 
properties, which is not the case with Büchi automata. 

Finally, it should be noted that safety analysis must be performed on a system be- 
fore action priority is applied for progress analysis purposes. Since action priority 
removes transitions, it may remove erroneous system behaviour. 
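
The point can be illustrated with a deliberately faulty toy model. In the Python sketch below, which is hypothetical throughout (the states, the bug and the function names are invented for illustration), an ERROR state is reachable only through a release action; once release is given low priority, the erroneous transition is pruned and a safety check on the pruned LTS would miss it.

```python
def apply_low_priority(lts, low):
    """Keep low-priority transitions only in states where nothing else is enabled."""
    pruned = {}
    for state, moves in lts.items():
        high = [(a, t) for a, t in moves if a not in low]
        pruned[state] = high if high else list(moves)
    return pruned


def reachable(lts, start):
    seen, todo = {start}, [start]
    while todo:
        for _, target in lts[todo.pop()]:
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return seen


# Hypothetical model: releasing while a single holder is present is a bug.
lts = {
    "idle": [("acquire", "held")],
    "held": [("acquire", "held2"), ("release", "ERROR")],
    "held2": [("release", "held")],
    "ERROR": [],
}

before = "ERROR" in reachable(lts, "idle")
after = "ERROR" in reachable(apply_low_priority(lts, {"release"}), "idle")
print("ERROR reachable before priority:", before)
print("ERROR reachable after priority: ", after)
```

The error is found on the original LTS but not on the pruned one, which is why safety analysis has to come first.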



Fig. 8. Büchi automaton used for checking progress property WRITER 



5 Related Work 

Progress. Manna and Pnueli classify properties of programs into a hierarchy, where 
each class is characterised by a canonical temporal formula scheme [24]. They associate 
the term progress with several classes of this hierarchy. These formulas do not 
always correspond to liveness properties in the safety-liveness classification. Their 
work gives a detailed description of the differences between the two classifications. In 
fact, our progress properties are a subclass of the properties referred to in [24] as 
response. The notion of progress also appears in Unity [5], where selected types of 
formulas are handled, and classified as safety and progress. Their progress properties 
correspond to LTL properties of the type □(a ⇒ ◇b) (leads to) and a U b (ensures), 
where U denotes strong until. 

SPIN [18] uses the notion of progress in a similar context to ours. The tool provides 
the facility to mark selected states of processes as progress states. It then checks that 
□◇progress, where progress is true in a system state if at least one of the system 
processes is in a progress state. The SPIN liveness checks also incorporate a weak fairness 
assumption with respect to processes. The different fairness assumption and the fact 
that we specify progress in terms of actions rather than states are largely determined 
by the difference in analysis approaches. SPIN uses an on-the-fly approach to analysis, 
which preserves information about states in individual processes, whereas we use 
CRA, where this information is not preserved under composition. 

Our approach differs significantly from that of SPIN, both in expressiveness and 
algorithmically. Currently, SPIN performs progress checks by introducing a 
pre-defined Büchi automaton for progress. As a result, the state space of the system is 
affected (this also holds for the original algorithm presented in [18], where a two-state 
demon process was added to the model to determine different modes for the checking 
algorithm). Unlike our approach, SPIN's progress mechanism can deal neither with 
the conjunction of a number of progress properties, nor with conditional progress. For 
example, it cannot check whether at least one progress state from each component 
process occurs infinitely often in the executions of a system. In SPIN, such properties are 
supported by the LTL model-checking mechanism. Further, we provide the option of 
action priority, which permits a system to be checked under adverse scheduling 
conditions. 

Fairness. The issue of fairness has been extensively investigated. Lehmann et al. 
introduced three notions of fairness that are useful in practice [21]. An infinite execu- 
tion is unconditionally fair if every transition is taken infinitely often, strongly fair if 
for every transition, if it is enabled infinitely often it is executed infinitely often, and 
weakly fair if for every transition, if it is enabled continuously from some point on, it 
is taken infinitely often. The term transition can be substituted by process or action to 
obtain the same fairness conditions with respect to processes [8] or actions [20]. 
Weak, strong, and unconditional fairness are also referred to as justice, fairness (or 
compassion) and impartiality. Based on these definitions, our assumption of fair 
choice corresponds to strong fairness with respect to the system transitions. Different 
notions of fairness are appropriate for different system models. Apt et al. [3] present 
some criteria of effectiveness and utility of adopting some notion of fairness in a com- 
putational model. 
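
The three notions can be compared on ultimately periodic executions, where a finite cycle is repeated forever. On such an execution, "infinitely often" means "somewhere on the cycle" and "continuously from some point on" means "everywhere on the cycle". The Python sketch below encodes exactly that reading; the representation of a cycle as a list of (enabled transitions, transition taken) pairs and the function name `fairness` are assumptions made for illustration.

```python
def fairness(cycle, transitions):
    """Classify the infinite execution that repeats `cycle` forever.

    cycle: non-empty list of (set of enabled transitions, transition taken).
    transitions: all transitions of the system.
    """
    taken = {t for _, t in cycle}
    enabled_sometimes = set().union(*(enabled for enabled, _ in cycle))
    enabled_always = set(transitions).intersection(*(enabled for enabled, _ in cycle))
    return {
        "unconditional": set(transitions) <= taken,
        "strong": enabled_sometimes <= taken,
        "weak": enabled_always <= taken,
    }


# Transition b is enabled at every other step but never taken;
# transition c is never enabled at all.
cycle = [({"a", "b"}, "a"), ({"a"}, "a")]
result = fairness(cycle, {"a", "b", "c"})
print(result)
```

This execution is weakly fair, since b is not continuously enabled, but it is neither strongly nor unconditionally fair. It is the kind of execution that an assumption of strong fairness on transitions rules out.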

Queille and Sifakis [27] stress the importance of defining fairness with respect to 
specific actions or predicates of the system, which they call relative fairness. Natara- 
jan and Cleaveland [25] take such an approach, and propose a notion of weak fairness 
with respect to success, in order to determine when a process passes a test. The 
framework presented by Manna and Pnueli [24] supports the specification of weak and 
strong fairness with respect to specific system transitions. 
A way of dealing with fairness in model checking is to add Büchi acceptance 
conditions to the system. For example in [1], all components of the system are Büchi 
automata, and therefore only executions that are acceptable by the product Büchi 
automaton are checked for correctness. Gribomont and Wolper [16] describe how a 
Büchi automaton can be used to express a fair process scheduler. Clarke et al. [8] 
extend their model with a set of predicates, so that fair paths are defined as paths in 
which each predicate holds infinitely often. This is equivalent to turning the model of 
the system into a generalised Büchi automaton. In this way, they can express both 
weak and unconditional fairness on processes. However, this requires the user to 
modify the initial model of the system. Finally, in Unity [5], the notion of fairness 
requires that every statement is selected infinitely often in any infinite execution. 

Priority. Priority has been introduced as a means of assigning more importance to 
some actions than others. Examples of actions that require special treatment are inter- 
rupts and timeouts. In [26], Phillips performs a study and comparison between various 
approaches to introducing priority in process algebra. Relative vs. absolute and condi- 
tional vs. exclusive forms of priority appear in the literature. Recently, dynamic prior- 
ity has also been proposed in the context of real-time systems [4]. In our approach, 
priority is not used as a modelling operator. Rather, it is simply used as a way of 
eliminating transitions, and obtaining system executions that would otherwise be con- 
sidered unfair. Therefore, we do not need to consider whether the semantic equiva- 
lence of our model remains a congruence with the introduction of a priority scheme. 
As a result, we have taken a very simple approach to priority, similar to the initial one 
proposed by Cleaveland and Hennessy in [10]. 



6 Discussion and Conclusions 

The work presented in this paper was motivated by a desire to achieve a balance 
between expressive power, accessibility and efficiency of analysis methods. Despite 
their expressive power, Büchi automata may exacerbate the state explosion problem. 
Moreover, they are not easy to specify without the use of an automated tool [19]. In 
general, this approach to verification is appropriate for experienced users of an 
analysis tool, who can effectively use a formalism like LTL or Büchi automata to specify 
properties or fairness assumptions of the system. The user should only have to invest 
the effort of using such a mechanism if no simpler method is available for 
performing the specific analysis of interest. 

In general, methods should require minimal effort before engineers start realising 
the benefits from their use [9]. The progress checking mechanism that we propose 
provides a way of checking liveness in a system, which is easily accessible by 
non-experts. Although less expressive than LTL and Büchi automata, progress properties 
can be specified in a simple intuitive way, and can be checked on the unmodified LTS 
of the system. In the context of CRA, progress properties are specified independently 
of the processes and composite subsystems that form a system. Consequently, they can 
be applied meaningfully to a subsystem as well as to the composite system as long as 
the subsystem contains the progress actions in its alphabet. A single traversal of the 
LTS of a system is sufficient to check any number of progress properties. 

In our framework, progress and safety properties can be combined efficiently, and 
checked simultaneously. Therefore, users need to revert to LTL model-checking only 
for restricted classes of liveness properties. Our experience so far in analysing archi- 
tectural models leads us to believe that progress properties are sufficiently expressive 
to allow many liveness properties, of interest at the software architecture level, to be 
verified. For example, we applied our technique to a model of an Active Badge Sys- 
tem with 566,820 states and 2,428,488 transitions [22], and showed that badge com- 
mands are not acknowledged if badges move between locations too frequently. 

The combination of progress checks and action priority provides an elegant way of 
dealing with models that incorporate a notion of discrete time. The passing of time is 
modelled as a global tick action [28]. The maximal progress condition that is usually 
assumed for these discrete time models is ensured by making the tick action low 
priority: “>> {tick}”. The integrity of the model with respect to time can be checked 
by asserting the progress property “progress TIME = {tick}”. We have used 
these principles to construct and check a discrete time model for a Bounded Retrans- 
mission Protocol used in one of Philips’ products. 

In their work on patterns in property specifications [11], Dwyer et al. report that the 
most common property pattern is Response, described in LTL as □(a ⇒ ◇b). Our 
progress and conditional progress schemes cover a wide range of properties that fall in this 
category. For example, when □◇a holds in a system, □(a ⇒ ◇b) reduces to the 
conditional progress property “progress Response = if {a} then {b}”. 

The proposed fairness assumption has been elegantly incorporated in all our live- 
ness-checking mechanisms [13] (though, in this paper, it was presented in the context 
of progress). We found that the notion of fairness with respect to transitions fits more 
naturally with our framework. In the context of CRA, it is not easy to apply fairness 
with respect to processes of the system, because the LTS of a composite system does 
not retain information about which processes it consists of. This could only be 
achieved by modifying the LTSs of the system components to record all necessary 
information, similarly to the approach proposed by Clarke et al. in [8]. 

In the context of liveness property checking, the possibility of including a notion of 
fairness is essential. When Büchi automata are used to express fairness constraints, 
users not familiar with the formalism are unable to check their model under any fair- 
ness conditions. In such cases, most of the counterexamples returned by the checking 
procedure correspond to unrealistic executions of the system analysed. As model 
checkers return a single counterexample for a property violation, the user has no way 
of finding out if the property checked is really violated, unless the counterexample is 
realistic. We believe that, rather than checking liveness with no fairness constraints 
and obtaining misleading violations, it is preferable from the developer’s point of view 
to get only realistic results from the tool, even at the risk of missing problems that may 
occur in practice. 

The advantage of action priority is that it is simple to model, and the LTS of the 
system is automatically updated accordingly. The user can therefore easily experiment 
with checking various instances of the system behaviour, by applying different 
priorities to it. As a result, the coverage of the checking mechanism under fair choice can be 
increased. This process is guided by users, who may enforce adverse scheduling con- 
ditions based on their intuition about vulnerable parts of the system behaviour. 

In the context of CRA, action priority is applied to produce subsystem versions 
solely for checking progress at the subsystem level. These “test” subsystems are not 
used in constructing composite behaviours, since the application of action priority 
removes parts of system behaviour. In our implementation, action priority is applied 
during the construction of a composite LTS from component processes. Therefore, 
action priority can also be used for performing partial searches on systems that are too 
large for exhaustive exploration. In these cases, action priority provides a way of se- 
lecting interesting behaviours for analysis. The current priority scheme allows only 
coarse-grained control of scheduling. To refine this control, we plan to investigate the 
use of more powerful priority schemes, such as relative and dynamic action priorities. 